
Enable caching for LLM requests with configurable cache names #677

Merged
merged 3 commits into main from no-cache-defaults on Aug 29, 2024

Conversation

@pelikhan commented Aug 29, 2024

  • The caching functionality in script execution has been thoroughly revamped. 🔄

    • LLM requests are no longer cached by default. Caching can be enabled via script metadata or CLI arguments.
    • The conditions that previously enabled caching automatically (temperature < 0.5, top_p < 0.5, no seed, no functions used) have been removed.
  • There are user-facing changes in the public API.

    • The cache option in scripts now accepts a boolean or a string. If it is true, the script uses caching; a string value also overrides the default cache name (see the sketch below).
    • The cacheName option is now deprecated, replaced by the string form of cache. 💾
  • Changes have been made to OpenAIChatCompletion in openai.ts.

    • It previously used internal conditions (temperature, top_p, seed, and available tools) to decide whether to cache. It now simply checks whether a cache property was passed in the request.
  • In constants.ts, the value of CHAT_CACHE changed from chatv2 to chat, and GITHUB_PULLREQUEST_REVIEW_COMMENT_LINE_DISTANCE changed from 5 to 6. ⛓

  • Scripts inside genaisrc have been updated for the new caching behavior: the separate cache and cacheName options are replaced with a single cache option that holds a boolean or a string (the cache name).

In essence, this update is focused on giving more control over caching to the user rather than the system deciding it based on certain conditions. 🎛

generated by pr-describe
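
A minimal sketch of the new metadata, based on the description above; the title value is illustrative:

```ts
// Enable caching under the default cache name.
script({
    title: "summarizer", // hypothetical title, for illustration only
    cache: true,
})

// Alternatively, a string value enables caching and overrides the
// default cache name (replacing the deprecated cacheName option):
// script({ title: "summarizer", cache: "summary" })
```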

@@ -8,12 +8,20 @@ keywords: cache management, LLM request caching, script performance, cache file

import { FileTree } from "@astrojs/starlight/components"


The statement about LLM requests caching has been changed. It was previously stated that LLM requests are cached by default, but the updated content states that they are not cached by default. This is a significant change and should be verified for accuracy.

generated by pr-docs-review-commit content_change


```sh "--cache"
npx genaiscript run ... --cache
```


New content has been added to explain how to enable LLM request caching. This includes a JavaScript code snippet and a shell command. Ensure that these instructions are correct and clear for users.

generated by pr-docs-review-commit content_addition

@@ -26,23 +34,6 @@ This file is excluded from git by default.

</FileTree>


The section on disabling the cache has been removed. If this information is still relevant and useful, consider adding it back to the documentation.

generated by pr-docs-review-commit content_removal

@@ -51,7 +42,7 @@ The name will be used to create a file in the `.genaiscript/cache` directory.
```js
script({
    ...,
-   cacheName: "summary"
+   cache: "summary"
})
```

The property name in the JavaScript code snippet has been changed from 'cacheName' to 'cache'. This could potentially confuse users if not properly explained in the surrounding text.

generated by pr-docs-review-commit content_change


The changes in the GIT_DIFF are:

  • The removal of the MAX_CACHED_TEMPERATURE and MAX_CACHED_TOP_P constants from constants.ts and their usage from openai.ts.
  • The CHAT_CACHE constant in constants.ts has been altered from chatv2 to chat.
  • Changes in openai.ts alter the caching behavior: it now checks whether cacheOrName is a string and, if so, uses it as the cache name. It no longer checks for the presence of seed, tools, temperature, or top_p to decide whether to cache.
  • prompt_template.d.ts changes the cache property in the ModelOptions interface: it can now be a boolean or a string, and a string value is used as the cache name (sketched below).
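
A sketch of what that typing likely looks like; only the cache and cacheName fields are drawn from the diff description, and the doc comments are an interpretation:

```ts
// Illustrative sketch of the widened option type described above.
interface ModelOptions {
    /**
     * Enables LLM request caching. `true` caches under the default
     * cache name; a string both enables caching and overrides the name.
     */
    cache?: boolean | string

    /**
     * @deprecated Use the string form of `cache` instead.
     */
    cacheName?: string
}
```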

I have a few concerns:

  1. The removal of the temperature and top_p checks may result in more cache misses as these parameters can significantly affect the output of a machine learning model. This could degrade performance.
  2. The usage of cacheOrName as a string to denote the cache name could cause confusion as its boolean value is also significant. This is not immediately apparent and could lead to future bugs or misuse of the API.

Suggested improvements:

Rename cacheOrName to something more descriptive like cacheControl and add comments explaining its behavior.

```ts
- const cache = getChatCompletionCache(
-     typeof cacheOrName === "string" ? cacheOrName : cacheName
- )

+ // cacheControl can be a boolean or a string.
+ // If it's a boolean, it determines whether to use the cache.
+ // If it's a string, it is used as the cache name.
+ const cacheControl = cacheOrName;
+ const cache = getChatCompletionCache(
+     typeof cacheControl === "string" ? cacheControl : cacheName
+ )
```

Aside from these concerns, the rest of the changes look good. They appear to be part of a refactor to simplify the caching behavior. No obvious security, reliability, scalability, or performance issues are visible from the diff.

generated by pr-review



@@ -51,7 +42,7 @@ The name will be used to create a file in the `.genaiscript/cache` directory.
```js
script({
    ...,
-   cacheName: "summary"
+   cache: "summary"
})
```


The section on disabling the cache has been removed. If this information is still relevant, it should be included in the documentation.

generated by pr-docs-review-commit content_removed


```ts
        `error: run failed with ${exitCode}, retry #${r + 1}/${runRetry} in ${delayMs}ms`
    )
    await delay(delayMs)
}
```


The retry logic has been changed to only retry if runRetry is greater than 1. This could potentially skip necessary retries if runRetry is 1, which might lead to unexpected behavior. Please ensure this is the intended behavior.

generated by pr-review-commit retry_logic
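
For context, a minimal sketch of the retry loop shape implied by the fragment above; runRetry, delayMs, and exitCode come from the snippet, while the surrounding control flow is an assumption:

```ts
// Sketch only: the actual loop in the PR may differ.
const delay = (ms: number) => new Promise<void>((res) => setTimeout(res, ms))

async function runWithRetries(
    run: () => Promise<number>, // hypothetical runner returning an exit code
    runRetry: number,
    delayMs: number
): Promise<number> {
    let exitCode = -1
    for (let r = 0; r < Math.max(runRetry, 1); ++r) {
        exitCode = await run()
        if (exitCode === 0) break
        // the reviewed change: a retry is only scheduled when runRetry > 1
        if (runRetry > 1 && r + 1 < runRetry) {
            console.error(
                `error: run failed with ${exitCode}, retry #${r + 1}/${runRetry} in ${delayMs}ms`
            )
            await delay(delayMs)
        }
    }
    return exitCode
}
```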

@@ -70,7 +70,7 @@ export interface PromptScriptRunOptions {
model: string


The type of cache has been changed from boolean to boolean | string. This could potentially break existing code that expects cache to be a boolean. Please ensure that all usage of cache has been updated to handle the new type.

generated by pr-review-commit cache_type_change
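
A small sketch of how call sites might narrow the widened type; resolveCacheName and defaultName are illustrative names, not from the PR:

```ts
// Resolves the widened cache option into an effective cache name,
// or undefined when caching is disabled. Illustrative only.
function resolveCacheName(
    cache: boolean | string | undefined,
    defaultName: string // e.g. the CHAT_CACHE value "chat"
): string | undefined {
    if (cache === undefined || cache === false) return undefined // caching disabled
    return typeof cache === "string" ? cache : defaultName
}

// resolveCacheName(true, "chat")      -> "chat"
// resolveCacheName("summary", "chat") -> "summary"
// resolveCacheName(false, "chat")     -> undefined
```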

@pelikhan merged commit 68bb5ca into main Aug 29, 2024
11 checks passed
@pelikhan deleted the no-cache-defaults branch August 29, 2024 19:47