Enable caching for LLM requests with configurable cache names
pelikhan committed Aug 29, 2024
1 parent d118049 commit cda4475
Showing 20 changed files with 69 additions and 92 deletions.
6 changes: 3 additions & 3 deletions docs/genaisrc/genaiscript.d.ts


37 changes: 14 additions & 23 deletions docs/src/content/docs/reference/scripts/cache.mdx
@@ -8,12 +8,20 @@ keywords: cache management, LLM request caching, script performance, cache file

import { FileTree } from "@astrojs/starlight/components"

Check warning on line 10 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The statement about LLM requests caching has been changed. It was previously stated that LLM requests are cached by default, but the updated content states that they are not cached by default. This is a significant change and should be verified for accuracy.
LLM requests are cached by default. This means that if a script generates the same prompt for the same model, the cache may be used.
LLM requests are **NOT** cached by default. However, you can turn on LLM request caching from `script` metadata or the CLI arguments.

- the `temperature` is less than 0.5
- the `top_p` is less than 0.5
- no [functions](./functions.md) are used as they introduce randomness
- `seed` is not used
```js "cache: true"
script({
...,
cache: true
})
```

or

```sh "--cache"
npx genaiscript run ... --cache
```

Check warning on line 24 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): New content has been added to explain how to enable LLM request caching. This includes a JavaScript code snippet and a shell command. Ensure that these instructions are correct and clear for users.

The cache is stored in the `.genaiscript/cache/chat.jsonl` file. You can delete this file to clear the cache.
This file is excluded from git by default.
@@ -26,23 +34,6 @@ This file is excluded from git by default.

</FileTree>

Check failure on line 36 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The section on disabling the cache has been removed. If this information is still relevant and useful, consider adding it back to the documentation.
## Disabling

You can always disable the cache using the `cache` option in `script`.

```js
script({
...,
cache: false // always off
})
```

Or using the `--no-cache` flag in the CLI.

```sh
npx genaiscript run .... --no-cache
```

## Custom cache file

Use the `cacheName` option to specify a custom cache file name.
@@ -51,7 +42,7 @@ The name will be used to create a file in the `.genaiscript/cache` directory.
```js
script({
...,

Check warning on line 44 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The property name in the JavaScript code snippet has been changed from 'cacheName' to 'cache'. This could potentially confuse users if not properly explained in the surrounding text.
cacheName: "summary"
cache: "summary"
})
```
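
For instance, migrating the sample script touched by this commit would look roughly as follows (the model and `gpt-cache` name are taken from `packages/sample/genaisrc/cache.genai.mts` below; the resulting `.genaiscript/cache/gpt-cache.jsonl` path is an assumption based on the default `chat.jsonl` location, not something this commit documents):

```ts
// before this commit: a boolean toggle plus a separate cache name
script({
    model: "openai:gpt-3.5-turbo",
    cache: true,
    cacheName: "gpt-cache",
})

// after this commit: one `cache` option carries both the toggle and the name,
// presumably persisted as .genaiscript/cache/gpt-cache.jsonl
script({
    model: "openai:gpt-3.5-turbo",
    cache: "gpt-cache",
})
```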

6 changes: 3 additions & 3 deletions genaisrc/genaiscript.d.ts


4 changes: 1 addition & 3 deletions packages/core/src/constants.ts
@@ -2,8 +2,6 @@ export const CHANGE = "change"
export const TRACE_CHUNK = "traceChunk"
export const RECONNECT = "reconnect"
export const OPEN = "open"
export const MAX_CACHED_TEMPERATURE = 0.5
export const MAX_CACHED_TOP_P = 0.5
export const MAX_TOOL_CALLS = 10000

// https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
@@ -211,7 +209,7 @@ export const GITHUB_API_VERSION = "2022-11-28"
export const GITHUB_TOKEN = "GITHUB_TOKEN"

export const AI_REQUESTS_CACHE = "airequests"
export const CHAT_CACHE = "chatv2"
export const CHAT_CACHE = "chat"
export const GITHUB_PULL_REQUEST_REVIEWS_CACHE = "prr"
export const GITHUB_PULLREQUEST_REVIEW_COMMENT_LINE_DISTANCE = 5

6 changes: 3 additions & 3 deletions packages/core/src/genaisrc/genaiscript.d.ts


27 changes: 8 additions & 19 deletions packages/core/src/openai.ts
@@ -2,8 +2,6 @@ import { normalizeInt, trimTrailingSlash } from "./util"
import { LanguageModelConfiguration, host } from "./host"
import {
AZURE_OPENAI_API_VERSION,
MAX_CACHED_TEMPERATURE,
MAX_CACHED_TOP_P,
MODEL_PROVIDER_OPENAI,
TOOL_ID,
} from "./constants"
@@ -50,13 +48,10 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
options,
trace
) => {
const { temperature, top_p, seed, tools } = req
const {
requestOptions,
partialCb,
maxCachedTemperature = MAX_CACHED_TEMPERATURE,
maxCachedTopP = MAX_CACHED_TOP_P,
cache: useCache,
cache: cacheOrName,
cacheName,
retry,
retryDelay,
@@ -69,18 +64,12 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
const { model } = parseModelIdentifier(req.model)
const encoder = await resolveTokenEncoder(model)

const cache = getChatCompletionCache(cacheName)
const caching =
useCache === true || // always use cache
(useCache !== false && // never use cache
seed === undefined && // seed is not cacheable (let the LLM make the run deterministic)
!tools?.length && // assume tools are non-deterministic by default
(isNaN(temperature) ||
isNaN(maxCachedTemperature) ||
temperature < maxCachedTemperature) && // high temperature is not cacheable (it's too random)
(isNaN(top_p) || isNaN(maxCachedTopP) || top_p < maxCachedTopP))
trace.itemValue(`caching`, caching)
const cachedKey = caching
const cache = getChatCompletionCache(
typeof cacheOrName === "string" ? cacheOrName : cacheName
)
trace.itemValue(`caching`, !!cache)
trace.itemValue(`cache`, cache?.name)
const cachedKey = !!cacheOrName
? <ChatCompletionRequestCacheKey>{
...req,
...cfgNoToken,
@@ -263,7 +252,7 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
responseSoFar: chatResp,
tokensSoFar: numTokens,
responseChunk: progress,
inner
inner,
})
}
pref = chunk
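
Read together with the constants change above, the new logic appears to key caching off the `cache` value itself: a string selects the cache name, `true` falls back to the deprecated `cacheName`, and a falsy value leaves the request uncached. A standalone sketch of that resolution follows; `resolveCacheName` and the fallback to the `chat` default are illustrative assumptions, not the module's actual helper:

```ts
// Sketch only: mirrors the resolution visible in the openai.ts hunk above.
const CHAT_CACHE = "chat" // default cache name, per constants.ts in this commit

function resolveCacheName(
    cache?: boolean | string, // the new `cache` script option
    cacheName?: string // the deprecated `cacheName` option
): string | undefined {
    if (!cache) return undefined // falsy: the request is not cached
    if (typeof cache === "string") return cache // string: explicit cache name
    return cacheName ?? CHAT_CACHE // true: legacy name, presumably defaulting to "chat"
}

// resolveCacheName("summary")            -> "summary"
// resolveCacheName(true)                 -> "chat"
// resolveCacheName(undefined, "summary") -> undefined (per the hunk, only `cache` turns caching on)
```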
6 changes: 3 additions & 3 deletions packages/core/src/types/prompt_template.d.ts
@@ -176,13 +176,13 @@ interface ModelOptions extends ModelConnectionOptions {
seed?: number

/**
* If true, the prompt will be cached. If false, the LLM chat is never cached.
* Leave empty to use the default behavior.
* By default, LLM queries are not cached. If true, the LLM request will be cached. Use a string to override the default cache name
*/
cache?: boolean
cache?: boolean | string

/**
* Custom cache name. If not set, the default cache is used.
* @deprecated Use `cache` instead with a string
*/
cacheName?: string
}
3 changes: 1 addition & 2 deletions packages/sample/genaisrc/cache.genai.mts
@@ -1,7 +1,6 @@
script({
model: "openai:gpt-3.5-turbo",
cache: true,
cacheName: "gpt-cache",
cache: "gpt-cache",
tests: [{}, {}], // run twice to trigger caching
})

6 changes: 3 additions & 3 deletions packages/sample/genaisrc/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/genaisrc/node/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/genaisrc/python/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/genaisrc/style/genaiscript.d.ts


2 changes: 1 addition & 1 deletion packages/sample/genaisrc/summary-of-summary-gpt35.genai.js
@@ -15,7 +15,7 @@ for (const file of env.files) {
_.def("FILE", file)
_.$`Summarize FILE. Be concise.`
},
{ model: "gpt-3.5-turbo", cacheName: "summary_gpt35" }
{ model: "gpt-3.5-turbo", cache: "summary_gpt35" }
)
// save the summary in the main prompt
def("FILE", { filename: file.filename, content: text })
4 changes: 2 additions & 2 deletions packages/sample/genaisrc/summary-of-summary-phi3.genai.js
@@ -5,7 +5,7 @@ script({
tests: {
files: ["src/rag/*.md"],
keywords: ["markdown", "lorem", "microsoft"],
}
},
})

// summarize each files individually
@@ -15,7 +15,7 @@ for (const file of env.files) {
_.def("FILE", file)
_.$`Extract keywords for the contents of FILE.`
},
{ model: "ollama:phi3", cacheName: "summary_phi3" }
{ model: "ollama:phi3", cache: "summary_phi3" }
)
def("FILE", { ...file, content: text })
}
6 changes: 3 additions & 3 deletions packages/sample/src/aici/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/src/errors/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/src/makecode/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/src/tla/genaiscript.d.ts


6 changes: 3 additions & 3 deletions packages/sample/src/vision/genaiscript.d.ts


