# Integration Tests
This document provides information about the integration testing framework used
in this project.
## Overview
The integration tests are designed to validate the end-to-end functionality of
the Gemini CLI. They execute the built binary in a controlled environment and
verify that it behaves as expected when interacting with the file system.

These tests are located in the `integration-tests` directory and are run using a
custom test runner.
## Building the tests
Before running any integration tests, create the release bundle that the tests
will run against:
```bash
npm run bundle
```
You must re-run this command after making any changes to the CLI source code,
but not after making changes to tests.
## Running the tests
The integration tests are not run as part of the default `npm run test` command.
They must be run explicitly using the `npm run test:integration:all` script.

The integration tests can also be run using the following shortcut:
```bash
npm run test:e2e
```
## Running a specific set of tests
To run a subset of test files, use
`npm run <integration-test-command> <file_name1> ...`, where
`<integration-test-command>` is either `test:e2e` or one of the
`test:integration*` scripts and each `<file_name>` is one of the `.test.js`
files in the `integration-tests/` directory. For example, the following command
runs `list_directory.test.js` and `write_file.test.js`:
```bash
npm run test:e2e list_directory write_file
```
### Running a single test by name
To run a single test by its name, use the `--test-name-pattern` flag:
```bash
npm run test:e2e -- --test-name-pattern "reads a file"
```
### Deflaking a test
Before adding a **new** integration test, you should test it at least 5 times
with the deflake script or workflow to make sure that it is not flaky.
#### Deflake script
```bash
npm run deflake -- --runs=5 --command="npm run test:e2e -- -- --test-name-pattern '<your-new-test-name>'"
```
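For intuition, the idea behind deflaking can be sketched as a loop that reruns a command and counts failures. This is only an illustration: `rerun` is a hypothetical helper, not the project's deflake script, and how it compares to the real script's behavior is an assumption.

```bash
# Hypothetical helper (not the project's deflake script): run a command
# several times and report how many runs failed.
# The function body runs in a subshell so its variables stay local.
rerun() (
  runs=$1
  shift
  failures=0
  i=1
  while [ "$i" -le "$runs" ]; do
    # Discard the command's output; only its exit status matters here.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "failures: ${failures}/${runs}"
  # Succeed only if every run passed.
  [ "$failures" -eq 0 ]
)

# Usage (assumes the bundle has already been built):
#   rerun 5 npm run test:e2e -- --test-name-pattern "reads a file"
```

A test that fails in even one of the runs should be treated as flaky and fixed before it is added.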
#### Deflake Workflow
```bash
gh workflow run deflake.yml --ref <your-branch> -f test_name_pattern="<your-test-name-pattern>"
```
### Running all tests
To run the entire suite of integration tests, use the following command:
```bash
npm run test:integration:all
```
### Sandbox matrix
The `test:integration:all` script runs the tests with no sandboxing, with
`docker`, and with `podman`. Each variant can also be run on its own using the
following commands:
```bash
npm run test:integration:sandbox:none
```
```bash
npm run test:integration:sandbox:docker
```
```bash
npm run test:integration:sandbox:podman
```
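When one sandbox variant fails, it can help to keep going and see which of the others also fail. The following is a minimal sketch of that idea, assuming only the three script names listed above; `run_sandbox_matrix` is a hypothetical helper and is not part of the project.

```bash
# Hypothetical helper: run each sandbox variant in turn, keep going after a
# failure, and list the variants that failed at the end.
# The runner command is passed as arguments so it can be swapped out.
run_sandbox_matrix() (
  failed=""
  for sandbox in none docker podman; do
    "$@" "test:integration:sandbox:${sandbox}" || failed="${failed} ${sandbox}"
  done
  [ -z "$failed" ] || echo "failed sandboxes:${failed}"
  # Succeed only if no variant failed.
  [ -z "$failed" ]
)

# Usage:
#   run_sandbox_matrix npm run
```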
## Diagnostics
The integration test runner provides several options for diagnostics to help
track down test failures.
### Keeping test output
You can preserve the temporary files created during a test run for inspection.
This is useful for debugging issues with file system operations.

To keep the test output, set the `KEEP_OUTPUT` environment variable to `true`.
```bash
KEEP_OUTPUT=true npm run test:integration:sandbox:none
```
When output is kept, the test runner will print the path to the unique directory
for the test run.
### Verbose output
For more detailed debugging, set the `VERBOSE` environment variable to `true`.
```bash
VERBOSE=true npm run test:integration:sandbox:none
```
When using `VERBOSE=true` and `KEEP_OUTPUT=true` in the same command, the output
is streamed to the console and also saved to a log file within the test's
temporary directory.

The verbose output is formatted to clearly identify the source of the logs:
```
--- TEST: <log dir>:<test-name> ---
... output from the gemini command ...
--- END TEST: <log dir>:<test-name> ---
```
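Because each test's output is wrapped in these markers, a saved verbose log can be filtered down to a single test. The sketch below assumes the marker lines appear exactly as shown above, with the test name immediately before the trailing ` ---`; `extract_test_output` is a hypothetical helper, not something the test runner provides.

```bash
# Hypothetical helper: read a verbose log on stdin and print only the lines
# between the "--- TEST:" and "--- END TEST:" markers for the named test.
extract_test_output() {
  awk -v name="$1" '
    # Stop printing at any end marker (the marker itself is not printed).
    index($0, "--- END TEST: ") == 1 { printing = 0 }
    printing { print }
    # Start printing after the start marker whose name matches.
    index($0, "--- TEST: ") == 1 && index($0, ":" name " ---") > 0 { printing = 1 }
  '
}

# Usage (the log path is a placeholder):
#   extract_test_output "reads a file" < path/to/saved-verbose.log
```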
## Linting and formatting
To ensure code quality and consistency, the integration test files are linted as
part of the main build process. You can also manually run the linter and
auto-fixer.
### Running the linter
To check for linting errors, run the following command:
```bash
npm run lint
```
To automatically fix any fixable linting errors, append `:fix` to the script
name:
```bash
npm run lint:fix
```
## Directory structure
The integration tests create a unique directory for each test run inside the
`.integration-tests` directory. Within this directory, a subdirectory is created
for each test file, and within that, a subdirectory is created for each
individual test case.

This structure makes it easy to locate the artifacts for a specific test run,
file, or case.
```
.integration-tests/
└── <run-id>/
    └── <test-file-name>.test.js/
        └── <test-case-name>/
            ├── output.log
            └── ...other test artifacts...
```
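Given this layout, the logs from the most recent run can be listed with a short shell function. This is a sketch under two assumptions: that the most recently modified directory under `.integration-tests` is the latest run, and that logs are named `output.log` as in the tree above. `latest_run_logs` is a hypothetical helper.

```bash
# Hypothetical helper: print every output.log under the most recently
# modified run directory. The root defaults to .integration-tests.
# Parsing ls output breaks on unusual directory names (e.g. newlines),
# which is acceptable for a quick diagnostic.
latest_run_logs() (
  root=${1:-.integration-tests}
  latest=$(ls -td "$root"/*/ 2>/dev/null | head -n 1)
  [ -n "$latest" ] || exit 1
  # Drop the trailing slash so find prints clean paths.
  latest=${latest%/}
  find "$latest" -name output.log
)

# Usage:
#   latest_run_logs
#   latest_run_logs /path/to/.integration-tests
```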
## Continuous integration
To ensure the integration tests are always run, a GitHub Actions workflow is
defined in `.github/workflows/e2e.yml`. This workflow automatically runs the
integration tests for pull requests against the `main` branch, or when a pull
request is added to a merge queue.

The workflow runs the tests in different sandboxing environments to ensure
Gemini CLI is tested across each:
- `sandbox:none`: Runs the tests without any sandboxing.
- `sandbox:docker`: Runs the tests in a Docker container.
- `sandbox:podman`: Runs the tests in a Podman container.