docs/cli/token-caching.md

# Token caching and cost optimization

Gemini CLI automatically reduces API costs through token caching when you
authenticate with a Gemini API key or Vertex AI. Caching reuses previously sent
system instructions and context, so fewer tokens are processed in subsequent
requests.

**Token caching is available for:**

- Gemini API key users
- Vertex AI users (with a project and location configured)

**Token caching is not available for:**

- OAuth users (Google Personal/Enterprise accounts): the Code Assist API does
  not currently support cached content creation
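
The availability rules above can be sketched as a small check. This is an
illustration only, not the CLI's actual auth selection logic: the helper name
`caching_expected` is hypothetical, and the environment variable names
(`GEMINI_API_KEY`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`) are
assumptions about a typical setup.

```python
from typing import Mapping


def caching_expected(env: Mapping[str, str]) -> bool:
    """Guess whether token caching applies, based on environment variables.

    Hypothetical helper for illustration. The variable names are assumptions;
    the CLI's real auth selection may differ.
    """
    # Vertex AI: assumed to require both a project and a location.
    if env.get("GOOGLE_CLOUD_PROJECT") and env.get("GOOGLE_CLOUD_LOCATION"):
        return True
    # Gemini API key authentication.
    if env.get("GEMINI_API_KEY"):
        return True
    # Anything else is treated as OAuth, where caching is not available.
    return False


if __name__ == "__main__":
    import os

    print("Token caching expected:", caching_expected(os.environ))
```

Passing the environment in as a mapping keeps the check easy to test without
modifying the real process environment.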

Use the `/stats` command to view your token usage and cached token savings.
When cached tokens are available, they appear in the stats output.
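
To make the idea of "cached token savings" concrete, here is a minimal sketch
of the arithmetic. It is not how `/stats` is implemented: the `Usage` class and
`cached_share` function are hypothetical, the numbers are made up, and the
sketch assumes that cached tokens are counted as part of the prompt tokens.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-request token counts (names are illustrative)."""

    prompt_tokens: int  # all input tokens, assumed to include cached ones
    cached_tokens: int  # input tokens served from the cache


def cached_share(usage: Usage) -> float:
    """Return the fraction of prompt tokens that came from the cache."""
    if usage.prompt_tokens <= 0:
        return 0.0
    return usage.cached_tokens / usage.prompt_tokens


if __name__ == "__main__":
    # Made-up example values, not real CLI output.
    usage = Usage(prompt_tokens=12000, cached_tokens=9000)
    print(f"{cached_share(usage):.1%} of prompt tokens were cached")
    # prints "75.0% of prompt tokens were cached"
```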