
Enable caching for LLM requests with configurable cache names #677

Merged 3 commits on Aug 29, 2024
Changes from 1 commit
6 changes: 3 additions & 3 deletions docs/genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

37 changes: 14 additions & 23 deletions docs/src/content/docs/reference/scripts/cache.mdx
@@ -7,13 +7,21 @@
---

import { FileTree } from "@astrojs/starlight/components"

Check warning on line 10 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The statement about LLM requests caching has been changed. It was previously stated that LLM requests are cached by default, but the updated content states that they are not cached by default. This is a significant change and should be verified for accuracy. (generated by pr-docs-review-commit content_change)

LLM requests are cached by default. This means that if a script generates the same prompt for the same model, the cache may be used.
LLM requests are **NOT** cached by default. However, you can turn on LLM request caching from `script` metadata or the CLI arguments.

- the `temperature` is less than 0.5
- the `top_p` is less than 0.5
- no [functions](./functions.md) are used as they introduce randomness
- `seed` is not used
```js "cache: true"
script({
...,
cache: true
})
```

or

```sh "--cache"
npx genaiscript run ... --cache
```

Check warning on line 24 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): New content has been added to explain how to enable LLM request caching. This includes a JavaScript code snippet and a shell command. Ensure that these instructions are correct and clear for users. (generated by pr-docs-review-commit content_addition)

New code examples have been added to illustrate how to enable LLM request caching. Ensure these examples are correct and clear to the reader. (generated by pr-docs-review-commit code_example_added)


The cache is stored in the `.genaiscript/cache/chat.jsonl` file. You can delete this file to clear the cache.
This file is excluded from git by default.
@@ -25,24 +33,7 @@
- chat.jsonl

</FileTree>
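The clearing step described above can be sketched as a one-liner, assuming the default cache location shown in the file tree:

```shell
# remove the default chat cache file to clear cached LLM responses
rm -f .genaiscript/cache/chat.jsonl
```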

Check failure on line 36 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The section on disabling the cache has been removed. If this information is still relevant and useful, consider adding it back to the documentation. (generated by pr-docs-review-commit content_removal)

## Disabling

You can always disable the cache using the `cache` option in `script`.

```js
script({
...,
cache: false // always off
})
```

Or using the `--no-cache` flag in the CLI.

```sh
npx genaiscript run .... --no-cache
```

## Custom cache file

Use the `cacheName` option to specify a custom cache file name.
@@ -50,8 +41,8 @@

```js
script({
...,

Check warning on line 44 in docs/src/content/docs/reference/scripts/cache.mdx (GitHub Actions / build): The property name in the JavaScript code snippet has been changed from 'cacheName' to 'cache'. This could potentially confuse users if not properly explained in the surrounding text. (generated by pr-docs-review-commit content_change)

cacheName: "summary"
cache: "summary"
})

The section on disabling the cache has been removed. If this information is still relevant, it should be included in the documentation. (generated by pr-docs-review-commit content_removed)

The section on disabling the cache has been removed. This information might be important for users who want to disable caching. Consider adding it back or providing an alternative way to disable caching. (generated by pr-docs-review-commit content_removal)

```

6 changes: 3 additions & 3 deletions genaisrc/genaiscript.d.ts


4 changes: 1 addition & 3 deletions packages/core/src/constants.ts
Expand Up @@ -2,8 +2,6 @@ export const CHANGE = "change"
export const TRACE_CHUNK = "traceChunk"
export const RECONNECT = "reconnect"
export const OPEN = "open"
export const MAX_CACHED_TEMPERATURE = 0.5
export const MAX_CACHED_TOP_P = 0.5
export const MAX_TOOL_CALLS = 10000

// https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
@@ -211,7 +209,7 @@ export const GITHUB_API_VERSION = "2022-11-28"
export const GITHUB_TOKEN = "GITHUB_TOKEN"

export const AI_REQUESTS_CACHE = "airequests"
export const CHAT_CACHE = "chatv2"
export const CHAT_CACHE = "chat"
export const GITHUB_PULL_REQUEST_REVIEWS_CACHE = "prr"
export const GITHUB_PULLREQUEST_REVIEW_COMMENT_LINE_DISTANCE = 5

6 changes: 3 additions & 3 deletions packages/core/src/genaisrc/genaiscript.d.ts


27 changes: 8 additions & 19 deletions packages/core/src/openai.ts
@@ -2,8 +2,6 @@ import { normalizeInt, trimTrailingSlash } from "./util"
import { LanguageModelConfiguration, host } from "./host"
import {
AZURE_OPENAI_API_VERSION,
MAX_CACHED_TEMPERATURE,
MAX_CACHED_TOP_P,
MODEL_PROVIDER_OPENAI,
TOOL_ID,
} from "./constants"
@@ -50,13 +48,10 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
options,
trace
) => {
const { temperature, top_p, seed, tools } = req
const {
requestOptions,
partialCb,
maxCachedTemperature = MAX_CACHED_TEMPERATURE,
maxCachedTopP = MAX_CACHED_TOP_P,
cache: useCache,
cache: cacheOrName,
cacheName,
retry,
retryDelay,
@@ -69,18 +64,12 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
const { model } = parseModelIdentifier(req.model)
const encoder = await resolveTokenEncoder(model)

const cache = getChatCompletionCache(cacheName)
const caching =
useCache === true || // always use cache
(useCache !== false && // never use cache
seed === undefined && // seed is not cacheable (let the LLM make the run deterministic)
!tools?.length && // assume tools are non-deterministic by default
(isNaN(temperature) ||
isNaN(maxCachedTemperature) ||
temperature < maxCachedTemperature) && // high temperature is not cacheable (it's too random)
(isNaN(top_p) || isNaN(maxCachedTopP) || top_p < maxCachedTopP))
trace.itemValue(`caching`, caching)
const cachedKey = caching
const cache = getChatCompletionCache(
typeof cacheOrName === "string" ? cacheOrName : cacheName
)
trace.itemValue(`caching`, !!cache)
trace.itemValue(`cache`, cache?.name)
const cachedKey = !!cacheOrName
? <ChatCompletionRequestCacheKey>{
...req,
...cfgNoToken,
@@ -263,7 +252,7 @@ export const OpenAIChatCompletion: ChatCompletionHandler = async (
responseSoFar: chatResp,
tokensSoFar: numTokens,
responseChunk: progress,
inner
inner,
})
}
pref = chunk
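The cache selection in the openai.ts hunks above can be summarized in a small sketch. `resolveCacheName` is a hypothetical helper, not part of the diff; it mirrors how a string `cache` value doubles as the cache name, `true` falls back to the deprecated `cacheName`, and caching stays off by default (the `"chat"` default comes from the `CHAT_CACHE` constant changed in constants.ts):

```typescript
// Hypothetical helper mirroring the diff's cache resolution:
// a string `cache` value names the cache, `true` falls back to the
// deprecated `cacheName` (or the default "chat"), anything else
// leaves caching disabled.
function resolveCacheName(
    cache?: boolean | string,
    cacheName?: string
): string | undefined {
    if (typeof cache === "string") return cache // string doubles as name
    if (cache === true) return cacheName ?? "chat" // CHAT_CACHE default
    return undefined // caching is off by default
}
```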
6 changes: 3 additions & 3 deletions packages/core/src/types/prompt_template.d.ts
@@ -176,13 +176,13 @@ interface ModelOptions extends ModelConnectionOptions {
seed?: number

/**
* If true, the prompt will be cached. If false, the LLM chat is never cached.
* Leave empty to use the default behavior.
* By default, LLM queries are not cached. If true, the LLM request will be cached. Use a string to override the default cache name
*/
cache?: boolean
cache?: boolean | string

/**
* Custom cache name. If not set, the default cache is used.
* @deprecated Use `cache` instead with a string
*/
cacheName?: string
}
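The widened `cache?: boolean | string` option above admits three call styles. A minimal sketch, with the relevant options re-declared here purely for illustration:

```typescript
// Re-declaration of the relevant options, for illustration only.
interface CacheOptions {
    cache?: boolean | string // true = on, string = on with a custom name
    /** @deprecated use `cache` with a string instead */
    cacheName?: string
}

// The three accepted forms after this change:
const defaultCache: CacheOptions = { cache: true } // default cache file
const namedCache: CacheOptions = { cache: "summary" } // custom cache name
const noCache: CacheOptions = { cache: false } // disabled (also the default)
```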
3 changes: 1 addition & 2 deletions packages/sample/genaisrc/cache.genai.mts
@@ -1,7 +1,6 @@
script({
model: "openai:gpt-3.5-turbo",
cache: true,
cacheName: "gpt-cache",
cache: "gpt-cache",
tests: [{}, {}], // run twice to trigger caching
})

6 changes: 3 additions & 3 deletions packages/sample/genaisrc/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/genaisrc/node/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/genaisrc/python/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/genaisrc/style/genaiscript.d.ts

2 changes: 1 addition & 1 deletion packages/sample/genaisrc/summary-of-summary-gpt35.genai.js
@@ -15,7 +15,7 @@ for (const file of env.files) {
_.def("FILE", file)
_.$`Summarize FILE. Be concise.`
},
{ model: "gpt-3.5-turbo", cacheName: "summary_gpt35" }
{ model: "gpt-3.5-turbo", cache: "summary_gpt35" }
)
// save the summary in the main prompt
def("FILE", { filename: file.filename, content: text })
4 changes: 2 additions & 2 deletions packages/sample/genaisrc/summary-of-summary-phi3.genai.js
@@ -5,7 +5,7 @@ script({
tests: {
files: ["src/rag/*.md"],
keywords: ["markdown", "lorem", "microsoft"],
}
},
})

// summarize each files individually
@@ -15,7 +15,7 @@ for (const file of env.files) {
_.def("FILE", file)
_.$`Extract keywords for the contents of FILE.`
},
{ model: "ollama:phi3", cacheName: "summary_phi3" }
{ model: "ollama:phi3", cache: "summary_phi3" }
)
def("FILE", { ...file, content: text })
}
6 changes: 3 additions & 3 deletions packages/sample/src/aici/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/src/errors/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/src/makecode/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/src/tla/genaiscript.d.ts

6 changes: 3 additions & 3 deletions packages/sample/src/vision/genaiscript.d.ts
