# Token caching and cost optimization

Gemini CLI automatically optimizes API costs through token caching when using
API key authentication (Gemini API key or Vertex AI). This feature reuses
previous system instructions and context to reduce the number of tokens
processed in subsequent requests.

**Token caching is available for:**

- API key users (Gemini API key)
- Vertex AI users (with project and location setup)

**Token caching is not available for:**

- OAuth users (Google Personal/Enterprise accounts) - the Code Assist API does
  not support cached content creation at this time

You can view your token usage and cached token savings using the `/stats`
command. When cached tokens are available, they will be displayed in the stats
output.
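
To make the idea of "cached token savings" concrete, here is a minimal sketch of how the cached share of a request could be computed from two token counts. The `TokenUsage` class, its field names, and the numbers are hypothetical and chosen only for illustration; they are not the `/stats` output format. The sketch also assumes that cached tokens are counted as a subset of the total input tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Hypothetical per-request token counts (illustrative field names)."""

    prompt_tokens: int  # total input tokens, assumed to include cached ones
    cached_tokens: int  # input tokens that were served from the cache


def cached_share(usage: TokenUsage) -> float:
    """Return the fraction of input tokens served from cache (0.0 to 1.0)."""
    if usage.cached_tokens < 0 or usage.cached_tokens > usage.prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    if usage.prompt_tokens == 0:
        return 0.0  # nothing was sent, so nothing could be cached
    return usage.cached_tokens / usage.prompt_tokens


# Made-up numbers: 9,000 of 12,000 input tokens were reused from the cache.
usage = TokenUsage(prompt_tokens=12_000, cached_tokens=9_000)
print(f"{cached_share(usage):.0%} of input tokens came from cache")
```

A higher share means more of the repeated system instructions and context were reused instead of being processed again.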