diff --git a/packages/core/src/core/__snapshots__/prompts.test.ts.snap b/packages/core/src/core/__snapshots__/prompts.test.ts.snap index 438251ed1f..407fa3683e 100644 --- a/packages/core/src/core/__snapshots__/prompts.test.ts.snap +++ b/packages/core/src/core/__snapshots__/prompts.test.ts.snap @@ -715,7 +715,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -723,6 +723,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -855,7 +860,7 @@ Use the following guidelines to optimize your search and read patterns. # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Utilize specialized sub-agents (e.g., \`codebase_investigator\`) as the primary mechanism for initial discovery when the task involves **complex refactoring, codebase exploration or system-wide analysis**. For **simple, targeted searches** (like finding a specific function name, file path, or variable declaration), use \`grep_search\` or \`glob\` directly in parallel. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. @@ -863,6 +868,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -977,7 +987,7 @@ Use the following guidelines to optimize your search and read patterns. # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. @@ -985,6 +995,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -1613,7 +1628,7 @@ You have access to the following specialized skills. To activate a skill and rec # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -1621,6 +1636,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -1764,7 +1784,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -1772,6 +1792,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -1919,7 +1944,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -1927,6 +1952,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -2074,7 +2104,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -2082,6 +2112,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -2225,7 +2260,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -2233,6 +2268,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -2519,7 +2559,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use search tools extensively to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** If the request is ambiguous, broad in scope, or involves architectural decisions or cross-cutting changes, use the \`enter_plan_mode\` tool to safely research and design your strategy. Do NOT use Plan Mode for straightforward bug fixes, answering questions, or simple inquiries. 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -2527,6 +2567,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -2669,7 +2714,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -2677,6 +2722,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -3061,7 +3111,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -3069,6 +3119,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -3212,7 +3267,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -3220,6 +3275,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -3475,7 +3535,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -3483,6 +3543,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. @@ -3626,7 +3691,7 @@ For example: # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. 1. **Research:** Systematically map the codebase and validate assumptions. Use \`grep_search\` and \`glob\` search tools extensively (in parallel if independent) to understand file structures, existing code patterns, and conventions. Use \`read_file\` to validate all assumptions. **Prioritize empirical reproduction of reported issues to confirm the failure state.** 2. **Strategy:** Formulate a grounded plan based on your research. Share a concise summary of your strategy. @@ -3634,6 +3699,11 @@ Operate using a **Research -> Strategy -> Execution** lifecycle. For the Executi - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., \`replace\`, \`write_file\`, \`run_shell_command\`). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project. If unsure about these commands, you can ask the user if they'd like you to run them and if so how to. +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible. diff --git a/packages/core/src/prompts/snippets.ts b/packages/core/src/prompts/snippets.ts index 0de9b11e25..80dbd4b4d7 100644 --- a/packages/core/src/prompts/snippets.ts +++ b/packages/core/src/prompts/snippets.ts @@ -292,7 +292,7 @@ export function renderPrimaryWorkflows( # Primary Workflows ## Development Lifecycle -Operate using a **Research -> Strategy -> Execution** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. +Operate using a **Research -> Strategy -> Execution -> Review** lifecycle. For the Execution phase, resolve each sub-task through an iterative **Plan -> Act -> Validate** cycle. ${workflowStepResearch(options)} ${workflowStepStrategy(options)} @@ -300,6 +300,11 @@ ${workflowStepStrategy(options)} - **Plan:** Define the specific implementation approach **and the testing strategy to verify the change.** - **Act:** Apply targeted, surgical changes strictly related to the sub-task. Use the available tools (e.g., ${formatToolName(EDIT_TOOL_NAME)}, ${formatToolName(WRITE_FILE_TOOL_NAME)}, ${formatToolName(SHELL_TOOL_NAME)}). Ensure changes are idiomatically complete and follow all workspace standards, even if it requires multiple tool calls. **Include necessary automated tests; a change is incomplete without verification logic.** Avoid unrelated refactoring or "cleanup" of outside code. Before making manual code changes, check if an ecosystem tool (like 'eslint --fix', 'prettier --write', 'go fmt', 'cargo fmt') is available in the project to perform the task automatically. - **Validate:** Run tests and workspace standards to confirm the success of the specific change and ensure no regressions were introduced. After making code changes, execute the project-specific build, linting and type-checking commands (e.g., 'tsc', 'npm run lint', 'ruff check .') that you have identified for this project.${workflowVerifyStandardsSuffix(options.interactive)} +4. **Review:** After any body of work where you write more than one file or make non-trivial code changes, you MUST execute a final review step: + - **Analyze Diff:** If in a git repository, run \`git diff\` or \`git diff HEAD\` to visualize the exact changes made. + - **Verify Completeness:** Compare the work done against the user's original prompt. Ensure the intent was completely satisfied and the codebase has landed in the desired end state. If there are gaps, formulate follow-up tasks to address them. + - **Code Quality & Linting:** Verify that the generated code matches the codebase's linter preferences and any explicitly stated user preferences. + - **Targeted Refinements:** Apply best practice final touches. Look for opportunities to de-duplicate logic, break up excessively large functions into more focused ones, and generally manage code complexity. Follow general programming and ecosystem-specific best practices, keeping in mind that explicit user preferences take precedence. **Validation is the only path to finality.** Never assume success or settle for unverified changes. Rigorous, exhaustive verification is mandatory; it prevents the compounding cost of diagnosing failures later. A task is only complete when the behavioral correctness of the change has been verified and its structural integrity is confirmed within the full project context. Prioritize comprehensive validation above all else, utilizing redirection and focused analysis to manage high-output tasks without sacrificing depth. Never sacrifice validation rigor for the sake of brevity or to minimize tool-call overhead; partial or isolated checks are insufficient when more comprehensive validation is possible.