# Integration Tests

This document provides information about the integration testing framework used
in this project.

## Overview

The integration tests are designed to validate the end-to-end functionality of
the Gemini CLI. They execute the built binary in a controlled environment and
verify that it behaves as expected when interacting with the file system.

These tests are located in the `integration-tests` directory and are run using a
custom test runner.

## Building the tests

Before running any integration tests, create the release bundle that the tests
will run against:

```bash
npm run bundle
```

You must re-run this command after making any changes to the CLI source code,
but not after making changes to tests.

## Running the tests

The integration tests are not run as part of the default `npm run test` command.
They must be run explicitly using the `npm run test:integration:all` script.

The integration tests can also be run using the following shortcut:

```bash
npm run test:e2e
```

## Running a specific set of tests

To run a subset of test files, use
`npm run <integration test command> <file_name1> ...`, where
`<integration test command>` is either `test:e2e` or `test:integration*` and
`<file_name>` is any of the `.test.js` files in the `integration-tests/`
directory. For example, the following command runs `list_directory.test.js` and
`write_file.test.js`:

```bash
npm run test:e2e list_directory write_file
```

### Running a single test by name

To run a single test by its name, use the `--test-name-pattern` flag:

```bash
npm run test:e2e -- --test-name-pattern "reads a file"
```

### Deflaking a test

Before adding a **new** integration test, run it at least 5 times with the
deflake script or workflow to make sure that it is not flaky.

#### Deflake script

```bash
npm run deflake -- --runs=5 --command="npm run test:e2e -- -- --test-name-pattern '<your-new-test-name>'"
```
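To illustrate what deflaking checks for, here is a minimal sketch that repeats a command a fixed number of times and reports how many runs failed. It is not the project's `deflake` script, whose behavior may differ; `run_repeatedly` is a hypothetical helper name used only for this example.

```bash
# Hypothetical helper (not part of this repository): run a command N times
# and report how many of those runs exited with a non-zero status.
run_repeatedly() (
  runs="$1"
  shift
  failures=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Discard the command's output; only its exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures/$runs runs failed"
  # Succeed only if every run passed.
  [ "$failures" -eq 0 ]
)
```

For example, `run_repeatedly 5 npm run test:e2e -- --test-name-pattern "reads a file"` would repeat the single-test command shown earlier five times. A test that fails in only some of the runs is flaky.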

#### Deflake workflow

```bash
gh workflow run deflake.yml --ref <your-branch> -f test_name_pattern="<your-test-name-pattern>"
```

### Running all tests

To run the entire suite of integration tests, use the following command:

```bash
npm run test:integration:all
```

### Sandbox matrix

The `test:integration:all` script runs the tests with no sandboxing, with
`docker`, and with `podman`. Each configuration can also be run on its own using
the following commands:

```bash
npm run test:integration:sandbox:none
```

```bash
npm run test:integration:sandbox:docker
```

```bash
npm run test:integration:sandbox:podman
```

## Diagnostics

The integration test runner provides several options for diagnostics to help
track down test failures.

### Keeping test output

You can preserve the temporary files created during a test run for inspection.
This is useful for debugging issues with file system operations.

To keep the test output, set the `KEEP_OUTPUT` environment variable to `true`.

```bash
KEEP_OUTPUT=true npm run test:integration:sandbox:none
```

When output is kept, the test runner will print the path to the unique directory
for the test run.

### Verbose output

For more detailed debugging, set the `VERBOSE` environment variable to `true`.

```bash
VERBOSE=true npm run test:integration:sandbox:none
```

When using `VERBOSE=true` and `KEEP_OUTPUT=true` in the same command, the output
is streamed to the console and also saved to a log file within the test's
temporary directory.

The verbose output is formatted to clearly identify the source of the logs:

```
--- TEST: <log dir>:<test-name> ---
... output from the gemini command ...
--- END TEST: <log dir>:<test-name> ---
```
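When a saved log contains output from several tests, the markers above can be used to pull out a single test's block. The following is a minimal sketch, assuming the saved log uses exactly the marker format shown above; `extract_test_output` is a hypothetical helper name, not something the test runner provides.

```bash
# Hypothetical helper: print the lines between the "--- TEST: ...:<name> ---"
# and "--- END TEST: ...:<name> ---" markers for one test name.
# Assumption: the log file uses the marker format documented above.
extract_test_output() (
  name="$1"
  log="$2"
  awk -v name="$name" '
    # Start printing after the opening marker for this test.
    index($0, "--- TEST: ") == 1 && index($0, ":" name " ---") > 0 { on = 1; next }
    # Stop printing at the closing marker for this test.
    index($0, "--- END TEST: ") == 1 && index($0, ":" name " ---") > 0 { on = 0 }
    on
  ' "$log"
)
```

For example, `extract_test_output "reads a file" <path-to-log-file>` prints only the output recorded for the test named `reads a file`, where `<path-to-log-file>` is the log saved by a run with both `VERBOSE=true` and `KEEP_OUTPUT=true` set.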

## Linting and formatting

To ensure code quality and consistency, the integration test files are linted as
part of the main build process. You can also manually run the linter and
auto-fixer.

### Running the linter

To check for linting errors, run the following command:

```bash
npm run lint
```

To automatically fix any fixable linting errors, run the `lint:fix` script
instead:

```bash
npm run lint:fix
```

## Directory structure

The integration tests create a unique directory for each test run inside the
`.integration-tests` directory. Within this directory, a subdirectory is created
for each test file, and within that, a subdirectory is created for each
individual test case.

This structure makes it easy to locate the artifacts for a specific test run,
file, or case.

```
.integration-tests/
└── <run-id>/
    └── <test-file-name>.test.js/
        └── <test-case-name>/
            ├── output.log
            └── ...other test artifacts...
```
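Given this layout, the logs of the most recent run can be located with a short shell helper. This is a sketch under two assumptions: that output was kept (for example with `KEEP_OUTPUT=true`), and that the newest `<run-id>` directory is the one with the most recent modification time. `latest_run_logs` is a hypothetical name, not a script shipped with the project.

```bash
# Hypothetical helper: list every output.log under the most recent run
# directory inside .integration-tests (or another root passed as $1).
latest_run_logs() (
  root="${1:-.integration-tests}"
  # Assumption: the newest run directory has the latest modification time.
  latest="$(ls -1dt "$root"/*/ 2>/dev/null | head -n 1)"
  if [ -z "$latest" ]; then
    echo "no runs found under $root" >&2
    exit 1
  fi
  # Strip the trailing slash added by the glob, then search the run directory.
  find "${latest%/}" -type f -name 'output.log' | sort
)
```

Running `latest_run_logs` from the repository root prints one path per test case, which can then be opened or searched directly.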

## Continuous integration

To ensure the integration tests are always run, a GitHub Actions workflow is
defined in `.github/workflows/e2e.yml`. This workflow automatically runs the
integration tests for pull requests against the `main` branch, or when a pull
request is added to a merge queue.

The workflow runs the tests in different sandboxing environments to ensure
Gemini CLI is tested across each:

- `sandbox:none`: Runs the tests without any sandboxing.
- `sandbox:docker`: Runs the tests in a Docker container.
- `sandbox:podman`: Runs the tests in a Podman container.