diff --git a/MAINTAINER_ONBOARDING.md b/MAINTAINER_ONBOARDING.md index 16fae83a8a..38093f438b 100644 --- a/MAINTAINER_ONBOARDING.md +++ b/MAINTAINER_ONBOARDING.md @@ -58,11 +58,11 @@ _When you see "ALL SYSTEMS GO!", your workspace is ready._ Once initialized, you can launch tasks directly through `npm` or the entry point: -- **Review a PR**: `npm run workspace review` -- **Launch a Shell**: `npm run workspace:shell ` -- **Check Status**: `npm run workspace:status` -- **Cleanup All**: `npm run workspace:clean-all` -- **Kill Task**: `npm run workspace:kill ` +- **Review a PR**: `workspace review` +- **Launch a Shell**: `workspace:shell ` +- **Check Status**: `workspace:status` +- **Cleanup All**: `workspace:clean-all` +- **Kill Task**: `workspace:kill ` - **Stop Worker**: `npx tsx scripts/workspaces.ts fleet stop` (Recommended when finished to save cost). diff --git a/extensions/workspaces/docs/README.md b/extensions/workspaces/docs/README.md index 2561e8b222..fe5edc4fcf 100644 --- a/extensions/workspaces/docs/README.md +++ b/extensions/workspaces/docs/README.md @@ -1,90 +1,120 @@ # Workspace maintainer skill The `workspace` skill provides a high-performance, parallelized workflow for -workspaceing intensive developer tasks to a remote workstation. It leverages a -Node.js orchestrator to run complex validation playbooks concurrently in a +workspaceing intensive developer tasks to a remote workstation. It leverages a +Node.js orchestrator to run complex validation playbooks concurrently in a dedicated terminal window. ## Why use workspace? As a maintainer, you eventually reach the limits of how much work you can manage at once on a single local machine. Heavy builds, concurrent test suites, and -multiple PRs in flight can quickly overload local resources, leading to +multiple PRs in flight can quickly overload local resources, leading to performance degradation and developer friction. 
While manual remote management is a common workaround, it is often cumbersome -and context-heavy. The `workspace` skill addresses these challenges by providing: +and context-heavy. The `workspace` skill addresses these challenges by +providing: -- **Elastic compute**: Workspace resource-intensive build and lint suites to a - beefy remote workstation, keeping your local machine responsive. -- **Context preservation**: The main Gemini session remains interactive and - focused on high-level reasoning while automated tasks provide real-time - feedback in a separate window. -- **Automated orchestration**: The skill handles worktree provisioning, - script synchronization, and environment isolation automatically. -- **True parallelism**: Infrastructure validation, CI checks, and behavioral - proofs run simultaneously, compressing a 15-minute process into 3 minutes. +- **Elastic compute**: Workspace resource-intensive build and lint suites to a + beefy remote workstation, keeping your local machine responsive. +- **Context preservation**: The main Gemini session remains interactive and + focused on high-level reasoning while automated tasks provide real-time + feedback in a separate window. +- **Automated orchestration**: The skill handles worktree provisioning, script + synchronization, and environment isolation automatically. +- **True parallelism**: Infrastructure validation, CI checks, and behavioral + proofs run simultaneously, compressing a 15-minute process into 3 minutes. ## Agentic skills: Sync or Workspace -The `workspace` system is designed to work in synergy with specialized agentic +The `workspace` system is designed to work in synergy with specialized agentic skills. These skills can be run **synchronously** in your current terminal for -quick tasks, or **workspaceed** to a remote session for complex, iterative loops. +quick tasks, or **workspaceed** to a remote session for complex, iterative +loops. 
-- **`review-pr`**: Conducts high-fidelity, behavioral code reviews. It assumes - the infrastructure is already validated and focuses on physical proof of - functionality. -- **`fix-pr`**: An autonomous "Fix-to-Green" loop. It iteratively addresses - CI failures, merge conflicts, and review comments until the PR is mergeable. +- **`review-pr`**: Conducts high-fidelity, behavioral code reviews. It assumes + the infrastructure is already validated and focuses on physical proof of + functionality. +- **`fix-pr`**: An autonomous "Fix-to-Green" loop. It iteratively addresses CI + failures, merge conflicts, and review comments until the PR is mergeable. -When you run `npm run workspace fix`, the orchestrator provisions the remote +When you run `workspace fix`, the orchestrator provisions the remote environment and then launches a Gemini CLI session specifically powered by the `fix-pr` skill. ## Architecture: The Hybrid Powerhouse -The workspace system uses a **Hybrid VM + Docker** architecture designed for maximum performance and reliability: +The workspace system uses a **Hybrid VM + Docker** architecture designed for +maximum performance and reliability: -1. **The GCE VM (Raw Power)**: By running on high-performance Google Compute Engine instances, we workspace heavy CPU and RAM tasks (like full project builds and massive test suites) from your local machine, keeping your primary workstation responsive. +1. **The GCE VM (Raw Power)**: By running on high-performance Google Compute + Engine instances, we workspace heavy CPU and RAM tasks (like full project + builds and massive test suites) from your local machine, keeping your + primary workstation responsive. 2. **The Docker Container (Consistency & Resilience)**: - * **Source of Truth**: The `.gcp/Dockerfile.maintainer` defines the exact environment. If a tool is added there, every maintainer gets it instantly. - * **Zero Drift**: Containers are immutable. 
Every job starts in a fresh state, preventing the "OS rot" that typically affects persistent VMs. - * **Local-to-Remote Parity**: The same image can be run locally on your Mac or remotely in GCP, ensuring that "it works on my machine" translates 100% to the remote worker. - * **Safe Multi-tenancy**: Using Git Worktrees inside an isolated container environment allows multiple jobs to run in parallel without sharing state or polluting the host system. + - **Source of Truth**: The `.gcp/Dockerfile.maintainer` defines the exact + environment. If a tool is added there, every maintainer gets it instantly. + - **Zero Drift**: Containers are immutable. Every job starts in a fresh + state, preventing the "OS rot" that typically affects persistent VMs. + - **Local-to-Remote Parity**: The same image can be run locally on your Mac + or remotely in GCP, ensuring that "it works on my machine" translates 100% + to the remote worker. + - **Safe Multi-tenancy**: Using Git Worktrees inside an isolated container + environment allows multiple jobs to run in parallel without sharing state + or polluting the host system. ## Playbooks -- **`review`** (default): Build, CI check, static analysis, and behavioral proofs. -- **`fix`**: Iterative fixing of CI failures and review comments. -- **`ready`**: Final full validation (clean install + preflight) before merge. -- **`open`**: Provision a worktree and drop directly into a remote tmux session. +- **`review`** (default): Build, CI check, static analysis, and behavioral + proofs. +- **`fix`**: Iterative fixing of CI failures and review comments. +- **`ready`**: Final full validation (clean install + preflight) before merge. +- **`open`**: Provision a worktree and drop directly into a remote tmux session. ## Scenario and workflows ### Getting Started (Onboarding) -For a complete guide on setting up your remote environment, see the [Maintainer Onboarding Guide](../../../MAINTAINER_ONBOARDING.md). 
+ +For a complete guide on setting up your remote environment, see the +[Maintainer Onboarding Guide](../../../MAINTAINER_ONBOARDING.md). ### Persistence and Job Recovery -The workspace system is designed for high reliability and persistence. Jobs use a nested execution model to ensure they continue running even if your local terminal is closed or the connection is lost. +The workspace system is designed for high reliability and persistence. Jobs use +a nested execution model to ensure they continue running even if your local +terminal is closed or the connection is lost. ### How it Works -1. **Host-Level Persistence**: The orchestrator launches each job in a named **`tmux`** session on the remote VM. -2. **Container Isolation**: The actual work is performed inside the persistent `maintainer-worker` Docker container. + +1. **Host-Level Persistence**: The orchestrator launches each job in a named + **`tmux`** session on the remote VM. +2. **Container Isolation**: The actual work is performed inside the persistent + `maintainer-worker` Docker container. ### Re-attaching to a Job + If you lose your connection, you can easily resume your session: -- **Automatic**: Simply run the exact same command you started with (e.g., `npm run workspace 123 review`). The system will automatically detect the existing session and re-attach you. -- **Manual**: Use `npm run workspace:status` to find the session name, then use `ssh gcli-worker` to jump into the VM and `tmux attach -t ` to resume. +- **Automatic**: Simply run the exact same command you started with (e.g., + `workspace 123 review`). The system will automatically detect the existing + session and re-attach you. +- **Manual**: Use `workspace:status` to find the session name, then use + `ssh gcli-worker` to jump into the VM and `tmux attach -t ` to + resume. ## Technical details -This skill uses a **Worker Provider** abstraction (`GceCosProvider`) to manage the remote lifecycle. 
It uses an isolated Gemini profile on the remote host (`~/.workspace/gemini-cli-config`) to ensure that verification tasks do not interfere with your primary configuration. +This skill uses a **Worker Provider** abstraction (`GceCosProvider`) to manage +the remote lifecycle. It uses an isolated Gemini profile on the remote host +(`~/.workspace/gemini-cli-config`) to ensure that verification tasks do not +interfere with your primary configuration. ### Directory structure + - `scripts/providers/`: Modular worker implementations (GCE, etc.). -- `scripts/orchestrator.ts`: Local orchestrator (syncs scripts and pops terminal). +- `scripts/orchestrator.ts`: Local orchestrator (syncs scripts and pops + terminal). - `scripts/worker.ts`: Remote engine (provisions worktree and runs playbooks). - `scripts/check.ts`: Local status poller. - `scripts/clean.ts`: Remote cleanup utility. @@ -93,15 +123,19 @@ This skill uses a **Worker Provider** abstraction (`GceCosProvider`) to manage t ## Contributing If you want to improve this skill: + 1. Modify the TypeScript scripts in `scripts/`. 2. Update `SKILL.md` if the agent's instructions need to change. -3. Test your changes locally using `npm run workspace `. +3. Test your changes locally using `workspace `. ## Testing The orchestration logic for this skill is fully tested. To run the tests: + ```bash npx vitest .gemini/skills/workspace/tests/orchestration.test.ts ``` -These tests mock the external environment (SSH, GitHub CLI, and the file system) to ensure that the orchestration scripts generate the correct commands and handle environment isolation accurately. +These tests mock the external environment (SSH, GitHub CLI, and the file system) +to ensure that the orchestration scripts generate the correct commands and +handle environment isolation accurately. 
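Reviewer note on the README hunk above: the "Automatic" re-attach behavior (run the same command again and land in the existing session) can be illustrated with a minimal sketch. It assumes one `tmux` session per PR and playbook. The helper names (`shellQuote`, `sessionName`, `attachOrCreate`) and the `workspace-<pr>-<playbook>` naming scheme are hypothetical and are not taken from `scripts/orchestrator.ts`. Only the `tmux has-session`, `tmux attach` and `tmux new-session` subcommands are real.

```typescript
// Hypothetical helpers; none of these names come from scripts/orchestrator.ts.

/** Wrap a string in single quotes so a POSIX shell treats it as one argument. */
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

/** Assumed naming scheme: one tmux session per PR and playbook. */
function sessionName(pr: string, playbook: string): string {
  return `workspace-${pr}-${playbook}`.replace(/[^A-Za-z0-9_-]/g, "-");
}

/**
 * Build the command line to run on the remote VM.
 * `tmux has-session` exits 0 only when the named session already exists,
 * so the same invocation either re-attaches or starts a fresh job.
 */
function attachOrCreate(session: string, jobCommand: string): string {
  return [
    `if tmux has-session -t ${session} 2>/dev/null; then`,
    `tmux attach -t ${session};`,
    `else tmux new-session -s ${session} ${shellQuote(jobCommand)};`,
    "fi",
  ].join(" ");
}

const remoteCommand = attachOrCreate(
  sessionName("123", "review"),
  "echo 'job running'",
);
console.log(remoteCommand);
```

Using an explicit `if ... else ... fi` rather than `a && b || c` avoids starting a second session when an attach exits with a non-zero status.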
diff --git a/extensions/workspaces/docs/plan.workerabstraction.md b/extensions/workspaces/docs/plan.workerabstraction.md index 7f63f8eb51..ac4131c2c0 100644 --- a/extensions/workspaces/docs/plan.workerabstraction.md +++ b/extensions/workspaces/docs/plan.workerabstraction.md @@ -1,26 +1,40 @@ # Plan: Worker Provider Abstraction for Workspace System ## Objective -Abstract the remote execution infrastructure (GCE COS, GCE Linux, Cloud Workstations) behind a common `WorkerProvider` interface. This eliminates infrastructure-specific prompts (like "use container mode") and makes the system extensible to new backends. + +Abstract the remote execution infrastructure (GCE COS, GCE Linux, Cloud +Workstations) behind a common `WorkerProvider` interface. This eliminates +infrastructure-specific prompts (like "use container mode") and makes the system +extensible to new backends. ## Architectural Changes ### 1. New Provider Abstraction -Create a modular provider system where each infrastructure type implements a standard interface. -- **Base Interface**: `WorkerProvider` (methods for `exec`, `sync`, `provision`, `getStatus`). + +Create a modular provider system where each infrastructure type implements a +standard interface. + +- **Base Interface**: `WorkerProvider` (methods for `exec`, `sync`, `provision`, + `getStatus`). - **Implementations**: - - `GceCosProvider`: Handles COS with Cloud-Init and `docker exec` wrapping. - - `GceLinuxProvider`: Handles standard Linux VMs with direct execution. - - `LocalDockerProvider`: (Future) Runs workspace tasks in a local container. - - `WorkstationProvider`: (Future) Integrates with Google Cloud Workstations. + - `GceCosProvider`: Handles COS with Cloud-Init and `docker exec` wrapping. + - `GceLinuxProvider`: Handles standard Linux VMs with direct execution. + - `LocalDockerProvider`: (Future) Runs workspace tasks in a local container. + - `WorkstationProvider`: (Future) Integrates with Google Cloud Workstations. ### 2. 
Auto-Discovery + Modify `setup.ts` to: -- Prompt for a high-level "Provider Type" (e.g., "Google Cloud (COS)", "Google Cloud (Linux)"). -- Auto-detect environment details where possible (e.g., fetching internal IPs, identifying container names). + +- Prompt for a high-level "Provider Type" (e.g., "Google Cloud (COS)", "Google + Cloud (Linux)"). +- Auto-detect environment details where possible (e.g., fetching internal IPs, + identifying container names). ### 3. Clean Orchestration + Refactor `orchestrator.ts` to be provider-agnostic: + - It asks the provider to "Ensure Ready" (wake VM). - It asks the provider to "Prepare Environment" (worktree setup). - It asks the provider to "Launch Task" (tmux initialization). @@ -28,19 +42,28 @@ Refactor `orchestrator.ts` to be provider-agnostic: ## Implementation Steps ### Phase 1: Infrastructure Cleanup -- Move existing procedural logic from `fleet.ts`, `setup.ts`, and `orchestrator.ts` into a new `providers/` directory. -- Create `ProviderFactory` to instantiate the correct implementation based on `settings.json`. + +- Move existing procedural logic from `fleet.ts`, `setup.ts`, and + `orchestrator.ts` into a new `providers/` directory. +- Create `ProviderFactory` to instantiate the correct implementation based on + `settings.json`. ### Phase 2: Refactor Scripts -- **`fleet.ts`**: Proxy all actions (`provision`, `rebuild`, `stop`) to the provider. + +- **`fleet.ts`**: Proxy all actions (`provision`, `rebuild`, `stop`) to the + provider. - **`orchestrator.ts`**: Use the provider for the entire lifecycle of a job. - **`status.ts`**: Use the provider's `getStatus()` method to derive state. ### Phase 3: Validation + - Verify that the `gcli-worker` SSH alias and IAP tunneling remain functional. - Ensure "Fast-Path SSH" is still the primary interactive gateway. ## Verification -- Run `npm run workspace:fleet provision` and ensure it creates a COS-native worker. 
-- Run `npm run workspace:setup` and verify it no longer asks cryptic infrastructure questions. -- Launch a review and verify it uses `docker exec internally for the COS provider. + +- Run `workspace:fleet provision` and ensure it creates a COS-native worker. +- Run `workspace:setup` and verify it no longer asks cryptic infrastructure + questions. +- Launch a review and verify it uses `docker exec` internally for the COS + provider. diff --git a/skills/workspaces/SKILL.md b/skills/workspaces/SKILL.md deleted file mode 100644 index 6f60f551f1..0000000000 --- a/skills/workspaces/SKILL.md +++ /dev/null @@ -1,39 +0,0 @@ ---- -name: workspaces -description: Expertise in managing and utilizing Gemini Workspaces for high-performance remote development tasks. ---- - -# Gemini Workspaces Skill - -This skill enables the agent to utilize **Gemini Workspaces**—a high-performance, persistent remote development platform. It allows the agent to move intensive tasks (PR reviews, complex repairs, full builds) from the local environment to a dedicated cloud worker. - -## 🛠️ Key Capabilities -1. **Persistent Execution**: Jobs run in remote `tmux` sessions. Disconnecting or crashing the local terminal does not stop the remote work. -2. **Parallel Infrastructure**: The agent can launch a heavy task (like a full build or CI run) in a workspace while continuing to assist the user locally. -3. **Behavioral Fidelity**: Remote workers have full tool access (Git, Node, Docker, etc.) and high-performance compute, allowing the agent to provide behavioral proofs of its work. - -## 📋 Instructions for the Agent - -### When to use Workspaces -- **Intensive Tasks**: Full preflight runs, large-scale refactors, or deep PR reviews. -- **Persistent Logic**: When a task is expected to take longer than a few minutes and needs to survive local connection drops. -- **Environment Isolation**: When you need a clean, high-performance environment to verify a fix without polluting the user's local machine. 
- -### How to use Workspaces -1. **Setup**: If the user hasn't initialized their environment, instruct them to run `npm run workspace:setup`. -2. **Launch**: Use the `workspace` command to start a playbook: - ```bash - npm run workspace [action] - ``` - - Actions: `review` (default), `fix`, `ready`. -3. **Check Status**: See global state and active sessions with `npm run workspace:status`, or deep-dive into specific PR logs with `npm run workspace:check `. -4. **Cleanup**: - - **Bulk**: Clear all sessions/worktrees with `npm run workspace:clean-all`. - - **Surgical**: Kill a specific PR task with `npm run workspace:kill `. -5. **Fleet**: Manage VM lifecycle with `npm run workspace:fleet [stop|provision|list]`. - -## ⚠️ Important Constraints -- **Absolute Paths**: Always use absolute paths (e.g., `/mnt/disks/data/...`) when orchestrating remote commands. -- **npx tsx**: When running scripts manually from the skill directory, always prefix with `npx tsx` to ensure dependencies are available. -- **Be Behavioral**: Prioritize results from live execution (behavioral proofs) over static reading. -- **Multi-tasking**: Remind the user they can continue chatting in the main window while the heavy workspace task runs in the separate terminal window.
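
Reviewer note on `plan.workerabstraction.md` above: the plan names a `WorkerProvider` interface with `exec`, `sync`, `provision` and `getStatus`, plus a `ProviderFactory`, but the diff does not show their signatures. The following is a minimal sketch of one possible shape, not the PR's implementation. The method signatures, the `ProviderType` strings, the injected `Runner`, and the `createProvider` and `wrapForProvider` helpers are all assumptions made for illustration. The container name `maintainer-worker` and the `docker exec` wrapping for COS come from the docs in this diff.

```typescript
// Sketch only: signatures and type names are assumptions, not the PR's code.
type ProviderType = "gce-cos" | "gce-linux";
type WorkerStatus = "running" | "stopped";

interface WorkerProvider {
  provision(): Promise<void>;
  sync(localDir: string, remoteDir: string): Promise<void>;
  exec(command: string): Promise<number>; // resolves to the exit code
  getStatus(): Promise<WorkerStatus>;
}

/** Runs a command line on the VM (SSH in practice); injected so the sketch needs no network. */
type Runner = (commandLine: string) => Promise<number>;

/** Single-quote a string for a POSIX shell. */
function quoteForShell(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

/** COS jobs run inside the worker container; plain Linux VMs run the command directly. */
function wrapForProvider(type: ProviderType, command: string): string {
  if (type === "gce-cos") {
    return `docker exec maintainer-worker sh -c ${quoteForShell(command)}`;
  }
  return command;
}

/** Hypothetical stand-in for the planned `ProviderFactory`. */
function createProvider(type: ProviderType, run: Runner): WorkerProvider {
  let status: WorkerStatus = "stopped";
  return {
    async provision(): Promise<void> {
      status = "running"; // placeholder for real VM creation
    },
    async sync(_localDir: string, _remoteDir: string): Promise<void> {
      // placeholder: the real transfer mechanism is not shown in the diff
    },
    async exec(command: string): Promise<number> {
      return run(wrapForProvider(type, command));
    },
    async getStatus(): Promise<WorkerStatus> {
      return status;
    },
  };
}

// Usage: a fake runner records the command line instead of opening an SSH session.
const seen: string[] = [];
const provider = createProvider("gce-cos", async (line) => {
  seen.push(line);
  return 0;
});
provider.exec("npm test").then((code) => console.log(code, seen[0]));
```

Keeping the wrapping logic in a pure function makes the provider-specific behavior testable without SSH, which matches the mocked-environment approach the README describes for the orchestration tests.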