feat(skills): add agent-tui and tui-tester skills (#27121)

2026-05-19 08:14:35 -07:00 · 2026-05-18 11:46:02 -04:00
parent 84423e6ea1
commit 9d01958cdb
2 changed files with 395 additions and 0 deletions
@@ -0,0 +1,330 @@
+---
+name: agent-tui
+description: >
+  Main Agents: Do NOT use this skill directly. If you need to test the TUI, invoke the `tui_tester` subagent.
+  Drive terminal UI (TUI) applications programmatically for testing, automation, and inspection.
+  Use when: automating CLI/TUI interactions, regression testing terminal apps, or verifying interactive behavior.
+  Also use when: user asks "what is agent-tui", "what does agent-tui do", "demo agent-tui", "show me agent-tui", "how does agent-tui work", or wants to see it in action.
+---
+
+## 🚨 CRITICAL: macOS Daemon Workaround & Gemini CLI Usage 🚨
+
+When using `agent-tui` in this macOS environment, the default background daemonization process crashes, causing `Connection refused (os error 61)` errors. 
+
+**You MUST start the daemon manually shielded from TTY hangups before running any `agent-tui` commands.** Using `nohup` is insufficient; you must use `tmux` to provide a fully isolated pseudo-terminal. 
+
+To support parallel runs, **only restart the daemon if it is not currently running:**
+
+```bash
+# Check if daemon is alive, start it in tmux if it is not
+if ! agent-tui sessions >/dev/null 2>&1; then
+  tmux kill-session -t agent-tui 2>/dev/null || true
+  agent-tui daemon stop 2>/dev/null || true
+  rm -f /tmp/agent-tui*
+  tmux new-session -d -s agent-tui 'agent-tui daemon start --foreground > /tmp/agent-tui-daemon.log 2>&1'
+  sleep 1
+fi
+```
+
+### Session ID vs PID (Crucial for Reconnection)
+
+When `agent-tui run` returns JSON, it includes both a `session_id` and a `pid`. The `pid` is purely informational (the OS process ID of the child command). You **do not** use the `pid` to reconnect or issue commands. You must always use the `session_id` (e.g., `--session <id>`). 
+
+If the daemon crashes (`os error 61`), the pseudo-terminal is destroyed. Even if the child `pid` survives as an orphan, you cannot reconnect to it. You must restart the daemon using the workaround above and start a completely new session.
+
+### Testing the Gemini CLI
+
+When testing the Gemini CLI with `agent-tui`, there are several strict requirements to ensure deterministic and accurate behavior:
+
+1. **Build Before Running**: `agent-tui` runs the built JS files, not TypeScript. You **MUST** run `npm run build` or `npm run build:all` after making code changes and before launching the CLI with `agent-tui`.
+2. **Bypass Trust Modals**: Always pass `GEMINI_CLI_TRUST_WORKSPACE=true` in the environment. If you don't, any new project-level agents or extensions will trigger a full-screen "Acknowledge and Enable" modal. This modal steals focus, swallows automation keystrokes, and causes `agent-tui wait` commands to time out.
+3. **Isolated Environments**: If you need to test without real user credentials or existing agents interfering, isolate the global settings using `GEMINI_CLI_HOME=<some-test-dir>`.
+4. **Testing State Deltas (e.g., Reloads)**: If you are testing features that report deltas (e.g., `/agents reload` outputting "1 new local subagent"), you **MUST**:
+   - Start the CLI *first* so it establishes its baseline registry.
+   - Use a separate shell command (outside of `agent-tui`) to write the new agent `.md`/`.toml` file.
+   - Use `agent-tui type` and `press` to trigger the `/agents reload` command inside the running session.
+   - (If you add the files before starting the CLI, they become part of the baseline and won't trigger the delta logic).
+
+```bash
+# Example: Standard isolated run (sandboxed config + bypass trust modals)
+env GEMINI_CLI_TRUST_WORKSPACE=true GEMINI_CLI_HOME=test-gemini-home agent-tui run -d "$(pwd)" node packages/cli/dist/index.js
+```
+
+# Terminal Automation Mastery
+
+## Prerequisites
+
+- **Supported OS**: macOS or Linux (Windows not supported yet).
+- **Verify install**:
+
+```bash
+agent-tui --version
+```
+
+If not installed, use one of:
+
+```bash
+# Recommended: one-line install (macOS/Linux)
+curl -fsSL https://raw.githubusercontent.com/pproenca/agent-tui/master/install.sh | sh
+```
+
+```bash
+# Package manager
+npm i -g agent-tui
+pnpm add -g agent-tui
+bun add -g agent-tui
+```
+
+```bash
+# Build from source
+cargo install --git https://github.com/pproenca/agent-tui.git --path cli/crates/agent-tui
+```
+
+If you used the install script, ensure `~/.local/bin` is on your PATH.
+
+## Philosophy: Why Terminal Automation Is Different
+
+Terminal UIs are **stateless from the observer's perspective**. Unlike web browsers with a persistent DOM, terminal automation works with a constantly-refreshed character grid. This fundamental difference shapes everything:
+
+| Web Automation | Terminal Automation |
+|----------------|---------------------|
+| DOM persists across interactions | Screen buffer is redrawn constantly |
+| Selectors are stable | Text positions may shift |
+| Query once, act many times | Must re-verify before EVERY action |
+| Network events signal completion | Must detect visual stability |
+
+**The Core Insight**: agent-tui gives you vision without memory. Each screenshot is a fresh observation. Previous state means nothing after the UI changes. This isn't a limitation—it's the nature of terminal interaction.
+
+## Mental Model: The Feedback Loop
+
+Think of terminal automation as a **closed-loop control system**:
+
+```
+    ┌──────────────────────────────────────────────┐
+    │                                              │
+    ▼                                              │
+OBSERVE ──► DECIDE ──► ACT ──► WAIT ──► VERIFY ───┘
+   │                                        │
+   │                                        │
+   └─────── NEVER skip ◄────────────────────┘
+```
+
+**Each phase is mandatory.** Skipping verification is the #1 cause of flaky automation.
+
+### The "Fresh Eyes" Principle
+
+Every time you need to interact with the UI:
+
+1. **Take a fresh screenshot** — your previous one is now stale
+2. **Locate your target visually** — text positions may have changed
+3. **Verify the state** — the UI may have changed unexpectedly
+4. **Act only when stable** — animations and loading states cause failures
+
+This feels slower, but it's the only reliable approach. Optimistic reuse of stale state causes intermittent failures that are painful to debug.
+
+## Critical Rules (Non-Negotiable)
+
+> **RULE 1: Atomic Execution (No Pipelining)**
+> You are FORBIDDEN from chaining commands with `&&` (e.g., `type "x" && press Enter && wait`). Modals or UI updates can intercept your keystrokes. You MUST execute one atomic action, wait, screenshot, and verify before taking the next action in a new turn.
+
+> **RULE 2: Re-snapshot after EVERY action**
+> The UI state is invalidated by any change. Always take a fresh screenshot before acting again.
+
+> **RULE 3: Never act on unstable UI**
+> If the UI is animating, loading, or transitioning, `wait --stable` first. Acting during transitions because race conditions.
+
+> **RULE 4: Verify before claiming success**
+> Use `wait "expected text" --assert` to confirm outcomes. Don't assume an action worked—prove it.
+
+> **RULE 5: Error Recovery**
+> If a `wait` command times out, DO NOT blindly restart or kill the session. Execute `screenshot` to visually diagnose what unexpected UI element (modal, error dialog, lost focus) intercepted the flow.
+
+> **RULE 6: Clean up sessions**
+> Always end with `agent-tui kill`. Orphaned sessions consume resources and can interfere with future runs.
+
+## Decision Framework
+
+### Which Screenshot Mode?
+
+Use `screenshot --format json` when parsing automation output, or plain `screenshot` for human readable text.
+
+### How to Wait?
+
+```
+What are you waiting for?
+│
+├─► Specific text to appear
+│   └─► `wait "text" --assert` (fails if not found)
+│
+├─► Specific text to disappear
+│   └─► `wait "text" --gone --assert`
+│
+├─► UI to stop changing (animations, loading)
+│   └─► `wait --stable`
+│
+└─► Multiple conditions
+    └─► Chain waits sequentially
+```
+
+### How to Act?
+
+```
+What do you need to do?
+│
+├─► Type text into the terminal
+│   └─► `type "text"`
+│
+├─► Send keyboard shortcuts/navigation
+│   └─► `press Ctrl+C` or `press ArrowDown Enter`
+```
+
+## Core Workflow
+
+The canonical automation loop:
+
+```bash
+# 1. START: Launch the TUI app
+agent-tui run <command> [-- args...]
+
+# 2. OBSERVE: Get current UI state
+agent-tui screenshot --format json
+
+# 3. DECIDE: Based on text, determine next action
+# (This happens in your head/code)
+
+# 4. ACT: Execute the action
+agent-tui type "text"
+agent-tui press Enter
+
+# 5. WAIT: Synchronize with UI changes
+agent-tui wait "Expected" --assert    # or wait --stable
+
+# 6. VERIFY: Confirm the outcome (often combined with step 5)
+# If verification fails, handle the error
+
+# 7. REPEAT: Go back to step 2 until done
+
+# 8. CLEANUP: Always clean up
+agent-tui kill
+```
+
+## Anti-Patterns (What NOT to Do)
+
+### ❌ Acting During Animation/Loading
+
+```bash
+# WRONG: Acting immediately on dynamic UI
+agent-tui run my-app
+agent-tui screenshot --format json    # UI might still be loading!
+agent-tui type "value"                # ❌ Might miss the input field
+
+# RIGHT: Wait for stability first
+agent-tui run my-app
+agent-tui wait --stable               # Let UI settle
+agent-tui screenshot --format json    # Now it's reliable
+agent-tui type "value"
+```
+
+### ❌ Assuming Success Without Verification
+
+```bash
+# WRONG: Assuming the type worked
+agent-tui type "value"
+agent-tui press Enter
+# ...proceed as if success...       # ❌ What if it failed silently?
+
+# RIGHT: Verify the outcome
+agent-tui type "value"
+agent-tui press Enter
+agent-tui wait "Success" --assert    # ✓ Proves the action worked
+```
+
+### ❌ Skipping Cleanup
+
+```bash
+# WRONG: Forgetting to kill the session
+agent-tui run my-app
+# ...do stuff...
+# script ends                        # ❌ Session left running!
+
+# RIGHT: Always clean up
+agent-tui run my-app
+# ...do stuff...
+agent-tui kill                       # ✓ Clean exit
+```
+
+## Before You Start: Clarify Requirements
+
+Before automating any TUI, gather this information:
+
+1. **Command**: What exactly to run? (`my-app --flag` or `npm start`?)
+2. **Success criteria**: What text/state indicates success?
+3. **Input sequence**: What keystrokes/data to enter, in what order?
+4. **Safety**: Is it safe to submit forms, delete data, etc.?
+5. **Auth**: Does it need login? Test credentials?
+6. **Live preview**: Does the user want to watch? (`agent-tui live start --open`)
+
+If any of these are unclear, ask before running.
+
+## Demo Mode: Showing What agent-tui Can Do
+
+When a user asks what agent-tui is, wants a demo, or asks "show me how it works":
+
+1. **Don't explain—demonstrate.** Actions speak louder than words.
+2. **Use the live preview** so they can watch in real-time.
+3. **Run `top`**—it's universal and shows dynamic real-time updates.
+
+**Quick demo trigger phrases:**
+- "What is agent-tui?" / "What does agent-tui do?"
+- "Demo agent-tui" / "Show me agent-tui"
+- "How does agent-tui work?" / "See it in action"
+
+## Failure Recovery
+
+| Symptom | Diagnosis | Solution |
+|---------|-----------|----------|
+| "Text not found" | Stale view or text moved | Re-snapshot, locate text again |
+| Wait times out | UI didn't reach expected state | Check screenshot, verify expectations |
+| "Daemon not running" | Daemon crashed or not started | `agent-tui daemon start` |
+| Unexpected layout | Wrong terminal size | `agent-tui resize --cols 120 --rows 40` |
+| Session unresponsive | App crashed or hung | `agent-tui kill`, then re-run |
+| Repeated failures | Something fundamentally wrong | Stop after 3-5 attempts, ask user |
+
+## Self-Discovery: Use --help
+
+You don't need to memorize every flag. The CLI is self-documenting:
+
+```bash
+agent-tui --help                     # List all commands
+agent-tui run --help                 # Options for 'run'
+agent-tui screenshot --help          # Options for 'screenshot'
+agent-tui wait --help                # Options for 'wait'
+```
+
+**When in doubt, ask the CLI.** This skill teaches *when* and *why* to use commands. For exact flags and syntax, `--help` is authoritative.
+
+## Quick Reference
+
+```bash
+# Start app
+agent-tui run <cmd> [-- args]        # Launch TUI under control
+
+# Observe
+agent-tui screenshot                  # Plain text view
+agent-tui screenshot --format json    # Machine-readable output
+
+# Act
+agent-tui press Enter                 # Press key(s)
+agent-tui press Ctrl+C                # Keyboard shortcuts
+agent-tui type "text"                 # Type text
+
+# Wait/Verify
+agent-tui wait "text" --assert        # Wait for text, fail if not found
+agent-tui wait "text" --gone --assert # Wait for text to disappear
+agent-tui wait --stable               # Wait for UI to stop changing
+
+# Manage
+agent-tui sessions                    # List active sessions
+agent-tui live start --open           # Start live preview
+agent-tui kill                        # End current session
+```
@@ -0,0 +1,65 @@
+---
+name: tui-tester
+description: Expert guidance for testing Gemini CLI behavior and visual output using terminal automation.
+---
+
+# TUI Tester Skill
+
+This skill provides the operational manual for verifying Gemini CLI behavioral changes and visual output using terminal automation.
+
+## Core Responsibilities
+
+- **Verify Behavior**: Confirm that code changes result in the expected terminal interactions.
+- **Visual Validation**: Ensure the TUI renders correctly across different terminal sizes and states.
+- **Regression Testing**: Use automation to prevent breaking existing interactive workflows.
+
+## Critical Protocol
+
+When performing TUI testing, you must adhere to these strict rules:
+
+### 1. Initialization
+**YOUR ABSOLUTE FIRST ACTION MUST BE:**
+Activate the `agent-tui` skill. This provides the underlying tools needed for terminal automation.
+
+### 2. Environment Setup (macOS / Parallel Safe)
+Ensure the global daemon is running and the live preview is open:
+```bash
+if ! agent-tui sessions >/dev/null 2>&1; then
+  tmux kill-session -t agent-tui 2>/dev/null || true
+  agent-tui daemon stop 2>/dev/null || true
+  rm -f /tmp/agent-tui*
+  tmux new-session -d -s agent-tui 'agent-tui daemon start --foreground > /tmp/agent-tui-daemon.log 2>&1'
+  sleep 1
+fi
+agent-tui live start --open
+```
+
+### 3. Session Management
+- **Session IDs**: Always use the `session_id` returned by `agent-tui run` for subsequent interactions.
+- **Atomic Execution**: Execute exactly one command per turn. Do not pipeline actions.
+- **The Loop**: Action -> Wait -> Screenshot -> Verify -> Next Action.
+
+### 4. Gemini CLI Specifics
+- **Build First**: Always run `npm run build` or `npm run build:all` before testing local changes.
+- **Bypass Trust**: Set `GEMINI_CLI_TRUST_WORKSPACE=true` to avoid focus-stealing modals.
+- **Isolate Config**: Use `GEMINI_CLI_HOME` to prevent interference with your personal settings.
+
+## Workflow Example
+
+```bash
+# Start the CLI
+env GEMINI_CLI_TRUST_WORKSPACE=true agent-tui run node packages/cli/dist/index.js
+
+# Wait for the prompt
+agent-tui wait "│" --assert
+
+# Send a command
+agent-tui type "/help"
+agent-tui press Enter
+
+# Verify output
+agent-tui wait "Available Commands" --assert
+```
+
+## Error Recovery
+If a wait times out, take a fresh screenshot to diagnose the state. If you see `os error 61`, restart the daemon using the tmux method.