feat(skills): add agent-tui and tui-tester skills (#27121)

This commit is contained in:
Adam Weidman
2026-05-18 11:46:02 -04:00
committed by GitHub
parent 84423e6ea1
commit 9d01958cdb
2 changed files with 395 additions and 0 deletions
+330
View File
@@ -0,0 +1,330 @@
---
name: agent-tui
description: >
Main Agents: Do NOT use this skill directly. If you need to test the TUI, invoke the `tui_tester` subagent.
Drive terminal UI (TUI) applications programmatically for testing, automation, and inspection.
Use when: automating CLI/TUI interactions, regression testing terminal apps, or verifying interactive behavior.
Also use when: user asks "what is agent-tui", "what does agent-tui do", "demo agent-tui", "show me agent-tui", "how does agent-tui work", or wants to see it in action.
---
## 🚨 CRITICAL: macOS Daemon Workaround & Gemini CLI Usage 🚨
When using `agent-tui` in this macOS environment, the default background daemonization process crashes, causing `Connection refused (os error 61)` errors.
**You MUST start the daemon manually shielded from TTY hangups before running any `agent-tui` commands.** Using `nohup` is insufficient; you must use `tmux` to provide a fully isolated pseudo-terminal.
To support parallel runs, **only restart the daemon if it is not currently running:**
```bash
# Check if daemon is alive, start it in tmux if it is not
if ! agent-tui sessions >/dev/null 2>&1; then
tmux kill-session -t agent-tui 2>/dev/null || true
agent-tui daemon stop 2>/dev/null || true
rm -f /tmp/agent-tui*
tmux new-session -d -s agent-tui 'agent-tui daemon start --foreground > /tmp/agent-tui-daemon.log 2>&1'
sleep 1
fi
```
### Session ID vs PID (Crucial for Reconnection)
When `agent-tui run` returns JSON, it includes both a `session_id` and a `pid`. The `pid` is purely informational (the OS process ID of the child command). You **do not** use the `pid` to reconnect or issue commands. You must always use the `session_id` (e.g., `--session <id>`).
If the daemon crashes (`os error 61`), the pseudo-terminal is destroyed. Even if the child `pid` survives as an orphan, you cannot reconnect to it. You must restart the daemon using the workaround above and start a completely new session.
### Testing the Gemini CLI
When testing the Gemini CLI with `agent-tui`, there are several strict requirements to ensure deterministic and accurate behavior:
1. **Build Before Running**: `agent-tui` runs the built JS files, not TypeScript. You **MUST** run `npm run build` or `npm run build:all` after making code changes and before launching the CLI with `agent-tui`.
2. **Bypass Trust Modals**: Always pass `GEMINI_CLI_TRUST_WORKSPACE=true` in the environment. If you don't, any new project-level agents or extensions will trigger a full-screen "Acknowledge and Enable" modal. This modal steals focus, swallows automation keystrokes, and causes `agent-tui wait` commands to time out.
3. **Isolated Environments**: If you need to test without real user credentials or existing agents interfering, isolate the global settings using `GEMINI_CLI_HOME=<some-test-dir>`.
4. **Testing State Deltas (e.g., Reloads)**: If you are testing features that report deltas (e.g., `/agents reload` outputting "1 new local subagent"), you **MUST**:
- Start the CLI *first* so it establishes its baseline registry.
- Use a separate shell command (outside of `agent-tui`) to write the new agent `.md`/`.toml` file.
- Use `agent-tui type` and `press` to trigger the `/agents reload` command inside the running session.
- (If you add the files before starting the CLI, they become part of the baseline and won't trigger the delta logic).
```bash
# Example: Standard isolated run (sandboxed config + bypass trust modals)
env GEMINI_CLI_TRUST_WORKSPACE=true GEMINI_CLI_HOME=test-gemini-home agent-tui run -d "$(pwd)" node packages/cli/dist/index.js
```
# Terminal Automation Mastery
## Prerequisites
- **Supported OS**: macOS or Linux (Windows not supported yet).
- **Verify install**:
```bash
agent-tui --version
```
If not installed, use one of:
```bash
# Recommended: one-line install (macOS/Linux)
curl -fsSL https://raw.githubusercontent.com/pproenca/agent-tui/master/install.sh | sh
```
```bash
# Package manager
npm i -g agent-tui
pnpm add -g agent-tui
bun add -g agent-tui
```
```bash
# Build from source
cargo install --git https://github.com/pproenca/agent-tui.git --path cli/crates/agent-tui
```
If you used the install script, ensure `~/.local/bin` is on your PATH.
## Philosophy: Why Terminal Automation Is Different
Terminal UIs are **stateless from the observer's perspective**. Unlike web browsers with a persistent DOM, terminal automation works with a constantly-refreshed character grid. This fundamental difference shapes everything:
| Web Automation | Terminal Automation |
|----------------|---------------------|
| DOM persists across interactions | Screen buffer is redrawn constantly |
| Selectors are stable | Text positions may shift |
| Query once, act many times | Must re-verify before EVERY action |
| Network events signal completion | Must detect visual stability |
**The Core Insight**: agent-tui gives you vision without memory. Each screenshot is a fresh observation. Previous state means nothing after the UI changes. This isn't a limitation—it's the nature of terminal interaction.
## Mental Model: The Feedback Loop
Think of terminal automation as a **closed-loop control system**:
```
┌──────────────────────────────────────────────┐
│ │
▼ │
OBSERVE ──► DECIDE ──► ACT ──► WAIT ──► VERIFY ───┘
│ │
│ │
└─────── NEVER skip ◄────────────────────┘
```
**Each phase is mandatory.** Skipping verification is the #1 cause of flaky automation.
### The "Fresh Eyes" Principle
Every time you need to interact with the UI:
1. **Take a fresh screenshot** — your previous one is now stale
2. **Locate your target visually** — text positions may have changed
3. **Verify the state** — the UI may have changed unexpectedly
4. **Act only when stable** — animations and loading states cause failures
This feels slower, but it's the only reliable approach. Optimistic reuse of stale state causes intermittent failures that are painful to debug.
## Critical Rules (Non-Negotiable)
> **RULE 1: Atomic Execution (No Pipelining)**
> You are FORBIDDEN from chaining commands with `&&` (e.g., `type "x" && press Enter && wait`). Modals or UI updates can intercept your keystrokes. You MUST execute one atomic action, wait, screenshot, and verify before taking the next action in a new turn.
> **RULE 2: Re-snapshot after EVERY action**
> The UI state is invalidated by any change. Always take a fresh screenshot before acting again.
> **RULE 3: Never act on unstable UI**
> If the UI is animating, loading, or transitioning, `wait --stable` first. Acting during transitions because race conditions.
> **RULE 4: Verify before claiming success**
> Use `wait "expected text" --assert` to confirm outcomes. Don't assume an action worked—prove it.
> **RULE 5: Error Recovery**
> If a `wait` command times out, DO NOT blindly restart or kill the session. Execute `screenshot` to visually diagnose what unexpected UI element (modal, error dialog, lost focus) intercepted the flow.
> **RULE 6: Clean up sessions**
> Always end with `agent-tui kill`. Orphaned sessions consume resources and can interfere with future runs.
## Decision Framework
### Which Screenshot Mode?
Use `screenshot --format json` when parsing automation output, or plain `screenshot` for human readable text.
### How to Wait?
```
What are you waiting for?
├─► Specific text to appear
│ └─► `wait "text" --assert` (fails if not found)
├─► Specific text to disappear
│ └─► `wait "text" --gone --assert`
├─► UI to stop changing (animations, loading)
│ └─► `wait --stable`
└─► Multiple conditions
└─► Chain waits sequentially
```
### How to Act?
```
What do you need to do?
├─► Type text into the terminal
│ └─► `type "text"`
├─► Send keyboard shortcuts/navigation
│ └─► `press Ctrl+C` or `press ArrowDown Enter`
```
## Core Workflow
The canonical automation loop:
```bash
# 1. START: Launch the TUI app
agent-tui run <command> [-- args...]
# 2. OBSERVE: Get current UI state
agent-tui screenshot --format json
# 3. DECIDE: Based on text, determine next action
# (This happens in your head/code)
# 4. ACT: Execute the action
agent-tui type "text"
agent-tui press Enter
# 5. WAIT: Synchronize with UI changes
agent-tui wait "Expected" --assert # or wait --stable
# 6. VERIFY: Confirm the outcome (often combined with step 5)
# If verification fails, handle the error
# 7. REPEAT: Go back to step 2 until done
# 8. CLEANUP: Always clean up
agent-tui kill
```
## Anti-Patterns (What NOT to Do)
### ❌ Acting During Animation/Loading
```bash
# WRONG: Acting immediately on dynamic UI
agent-tui run my-app
agent-tui screenshot --format json # UI might still be loading!
agent-tui type "value" # ❌ Might miss the input field
# RIGHT: Wait for stability first
agent-tui run my-app
agent-tui wait --stable # Let UI settle
agent-tui screenshot --format json # Now it's reliable
agent-tui type "value"
```
### ❌ Assuming Success Without Verification
```bash
# WRONG: Assuming the type worked
agent-tui type "value"
agent-tui press Enter
# ...proceed as if success... # ❌ What if it failed silently?
# RIGHT: Verify the outcome
agent-tui type "value"
agent-tui press Enter
agent-tui wait "Success" --assert # ✓ Proves the action worked
```
### ❌ Skipping Cleanup
```bash
# WRONG: Forgetting to kill the session
agent-tui run my-app
# ...do stuff...
# script ends # ❌ Session left running!
# RIGHT: Always clean up
agent-tui run my-app
# ...do stuff...
agent-tui kill # ✓ Clean exit
```
## Before You Start: Clarify Requirements
Before automating any TUI, gather this information:
1. **Command**: What exactly to run? (`my-app --flag` or `npm start`?)
2. **Success criteria**: What text/state indicates success?
3. **Input sequence**: What keystrokes/data to enter, in what order?
4. **Safety**: Is it safe to submit forms, delete data, etc.?
5. **Auth**: Does it need login? Test credentials?
6. **Live preview**: Does the user want to watch? (`agent-tui live start --open`)
If any of these are unclear, ask before running.
## Demo Mode: Showing What agent-tui Can Do
When a user asks what agent-tui is, wants a demo, or asks "show me how it works":
1. **Don't explain—demonstrate.** Actions speak louder than words.
2. **Use the live preview** so they can watch in real-time.
3. **Run `top`**—it's universal and shows dynamic real-time updates.
**Quick demo trigger phrases:**
- "What is agent-tui?" / "What does agent-tui do?"
- "Demo agent-tui" / "Show me agent-tui"
- "How does agent-tui work?" / "See it in action"
## Failure Recovery
| Symptom | Diagnosis | Solution |
|---------|-----------|----------|
| "Text not found" | Stale view or text moved | Re-snapshot, locate text again |
| Wait times out | UI didn't reach expected state | Check screenshot, verify expectations |
| "Daemon not running" | Daemon crashed or not started | `agent-tui daemon start` |
| Unexpected layout | Wrong terminal size | `agent-tui resize --cols 120 --rows 40` |
| Session unresponsive | App crashed or hung | `agent-tui kill`, then re-run |
| Repeated failures | Something fundamentally wrong | Stop after 3-5 attempts, ask user |
## Self-Discovery: Use --help
You don't need to memorize every flag. The CLI is self-documenting:
```bash
agent-tui --help # List all commands
agent-tui run --help # Options for 'run'
agent-tui screenshot --help # Options for 'screenshot'
agent-tui wait --help # Options for 'wait'
```
**When in doubt, ask the CLI.** This skill teaches *when* and *why* to use commands. For exact flags and syntax, `--help` is authoritative.
## Quick Reference
```bash
# Start app
agent-tui run <cmd> [-- args] # Launch TUI under control
# Observe
agent-tui screenshot # Plain text view
agent-tui screenshot --format json # Machine-readable output
# Act
agent-tui press Enter # Press key(s)
agent-tui press Ctrl+C # Keyboard shortcuts
agent-tui type "text" # Type text
# Wait/Verify
agent-tui wait "text" --assert # Wait for text, fail if not found
agent-tui wait "text" --gone --assert # Wait for text to disappear
agent-tui wait --stable # Wait for UI to stop changing
# Manage
agent-tui sessions # List active sessions
agent-tui live start --open # Start live preview
agent-tui kill # End current session
```
+65
View File
@@ -0,0 +1,65 @@
---
name: tui-tester
description: Expert guidance for testing Gemini CLI behavior and visual output using terminal automation.
---
# TUI Tester Skill
This skill provides the operational manual for verifying Gemini CLI behavioral changes and visual output using terminal automation.
## Core Responsibilities
- **Verify Behavior**: Confirm that code changes result in the expected terminal interactions.
- **Visual Validation**: Ensure the TUI renders correctly across different terminal sizes and states.
- **Regression Testing**: Use automation to prevent breaking existing interactive workflows.
## Critical Protocol
When performing TUI testing, you must adhere to these strict rules:
### 1. Initialization
**YOUR ABSOLUTE FIRST ACTION MUST BE:**
Activate the `agent-tui` skill. This provides the underlying tools needed for terminal automation.
### 2. Environment Setup (macOS / Parallel Safe)
Ensure the global daemon is running and the live preview is open:
```bash
if ! agent-tui sessions >/dev/null 2>&1; then
tmux kill-session -t agent-tui 2>/dev/null || true
agent-tui daemon stop 2>/dev/null || true
rm -f /tmp/agent-tui*
tmux new-session -d -s agent-tui 'agent-tui daemon start --foreground > /tmp/agent-tui-daemon.log 2>&1'
sleep 1
fi
agent-tui live start --open
```
### 3. Session Management
- **Session IDs**: Always use the `session_id` returned by `agent-tui run` for subsequent interactions.
- **Atomic Execution**: Execute exactly one command per turn. Do not pipeline actions.
- **The Loop**: Action -> Wait -> Screenshot -> Verify -> Next Action.
### 4. Gemini CLI Specifics
- **Build First**: Always run `npm run build` or `npm run build:all` before testing local changes.
- **Bypass Trust**: Set `GEMINI_CLI_TRUST_WORKSPACE=true` to avoid focus-stealing modals.
- **Isolate Config**: Use `GEMINI_CLI_HOME` to prevent interference with your personal settings.
## Workflow Example
```bash
# Start the CLI
env GEMINI_CLI_TRUST_WORKSPACE=true agent-tui run node packages/cli/dist/index.js
# Wait for the prompt
agent-tui wait "│" --assert
# Send a command
agent-tui type "/help"
agent-tui press Enter
# Verify output
agent-tui wait "Available Commands" --assert
```
## Error Recovery
If a wait times out, take a fresh screenshot to diagnose the state. If you see `os error 61`, restart the daemon using the tmux method.