
Enable caching for LLM requests with configurable cache names #677

Merged
merged 3 commits into main from no-cache-defaults on Aug 29, 2024

Conversation

@pelikhan commented Aug 29, 2024

  • The caching functionality in script execution has been thoroughly revamped. 🔄

    • LLM requests are no longer cached by default. Caching can be enabled via script metadata or CLI arguments.
    • The conditions that previously enabled caching automatically (temperature < 0.5, top_p < 0.5, no seed, no functions used) have been removed.
  • There are user-facing changes in the public API.

    • The cache option in scripts now accepts a boolean or a string. If it is true, the script uses caching; a string value also overrides the default cache name (see the sketch below).
    • The cacheName option is now deprecated, replaced by the string form of cache. 💾
  • Changes have been made to OpenAIChatCompletion in openai.ts.

    • It previously used internal conditions (temperature, top_p, seed, and available tools) to decide whether to cache. It now simply checks whether a cache property was passed in the request.
  • In constants.ts, the value of CHAT_CACHE changed from chatv2 to chat, and GITHUB_PULLREQUEST_REVIEW_COMMENT_LINE_DISTANCE changed from 5 to 6. ⛓

  • Scripts inside genaisrc have been updated for the new caching behavior: the separate cache and cacheName options are replaced with a single cache option that holds a boolean or a string (the cache name).

In essence, this update is focused on giving more control over caching to the user rather than the system deciding it based on certain conditions. 🎛

generated by pr-describe
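
A minimal sketch of the new metadata, based on the description above; the title value is illustrative:

```ts
// Enable caching under the default cache name.
script({
    title: "summarizer", // hypothetical title, for illustration only
    cache: true,
})

// Alternatively, a string value enables caching and overrides the
// default cache name (replacing the deprecated cacheName option):
// script({ title: "summarizer", cache: "summary" })
```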

@@ -8,12 +8,20 @@ keywords: cache management, LLM request caching, script performance, cache file

import { FileTree } from "@astrojs/starlight/components"


The statement about LLM requests caching has been changed. It was previously stated that LLM requests are cached by default, but the updated content states that they are not cached by default. This is a significant change and should be verified for accuracy.

generated by pr-docs-review-commit content_change


```sh "--cache"
npx genaiscript run ... --cache
```


New content has been added to explain how to enable LLM request caching. This includes a JavaScript code snippet and a shell command. Ensure that these instructions are correct and clear for users.

generated by pr-docs-review-commit content_addition

@@ -26,23 +34,6 @@ This file is excluded from git by default.

</FileTree>


The section on disabling the cache has been removed. If this information is still relevant and useful, consider adding it back to the documentation.

generated by pr-docs-review-commit content_removal

@@ -51,7 +42,7 @@ The name will be used to create a file in the `.genaiscript/cache` directory.
```js
script({
    ...,
-   cacheName: "summary"
+   cache: "summary"
})
```

The property name in the JavaScript code snippet has been changed from 'cacheName' to 'cache'. This could potentially confuse users if not properly explained in the surrounding text.

generated by pr-docs-review-commit content_change


The changes in the GIT_DIFF are:

  • The removal of the MAX_CACHED_TEMPERATURE and MAX_CACHED_TOP_P constants from constants.ts and their usage from openai.ts.
  • The CHAT_CACHE constant in constants.ts has been altered from chatv2 to chat.
  • Changes in openai.ts alter the caching behavior: it now checks whether cacheOrName is a string and, if so, uses it as the cache name. It no longer checks for the presence of seed, tools, temperature, or top_p to decide whether to cache.
  • prompt_template.d.ts changes the cache property in the ModelOptions interface: it can now be a boolean or a string, and a string value is used as the cache name (sketched below).
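
A sketch of what that typing likely looks like; only the cache and cacheName fields are drawn from the diff description, and the doc comments are an interpretation:

```ts
// Illustrative sketch of the widened option type described above.
interface ModelOptions {
    /**
     * Enables LLM request caching. `true` caches under the default
     * cache name; a string both enables caching and overrides the name.
     */
    cache?: boolean | string

    /**
     * @deprecated Use the string form of `cache` instead.
     */
    cacheName?: string
}
```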

I have a few concerns:

  1. The removal of the temperature and top_p checks may result in more cache misses as these parameters can significantly affect the output of a machine learning model. This could degrade performance.
  2. The usage of cacheOrName as a string to denote the cache name could cause confusion as its boolean value is also significant. This is not immediately apparent and could lead to future bugs or misuse of the API.

Suggested improvements:

Rename cacheOrName to something more descriptive like cacheControl and add comments explaining its behavior.

```ts
- const cache = getChatCompletionCache(
-     typeof cacheOrName === "string" ? cacheOrName : cacheName
- )

+ // cacheControl can be a boolean or a string.
+ // If it's a boolean, it determines whether to use the cache.
+ // If it's a string, it is used as the cache name.
+ const cacheControl = cacheOrName;
+ const cache = getChatCompletionCache(
+     typeof cacheControl === "string" ? cacheControl : cacheName
+ )
```

Aside from these concerns, the rest of the changes look good. They appear to be part of a refactor to simplify the caching behavior. No obvious security, reliability, scalability, or performance issues are visible from the diff.

generated by pr-review



@@ -51,7 +42,7 @@ The name will be used to create a file in the `.genaiscript/cache` directory.
```js
script({
    ...,
-   cacheName: "summary"
+   cache: "summary"
})
```


The section on disabling the cache has been removed. If this information is still relevant, it should be included in the documentation.

generated by pr-docs-review-commit content_removed


```ts
        `error: run failed with ${exitCode}, retry #${r + 1}/${runRetry} in ${delayMs}ms`
    )
    await delay(delayMs)
}
```


The retry logic has been changed to only retry if runRetry is greater than 1. This could potentially skip necessary retries if runRetry is 1, which might lead to unexpected behavior. Please ensure this is the intended behavior.

generated by pr-review-commit retry_logic
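
For context, a minimal sketch of the retry loop shape implied by the fragment above; runRetry, delayMs, and exitCode come from the snippet, while the surrounding control flow is an assumption:

```ts
// Sketch only: the actual loop in the PR may differ.
const delay = (ms: number) => new Promise<void>((res) => setTimeout(res, ms))

async function runWithRetries(
    run: () => Promise<number>, // hypothetical runner returning an exit code
    runRetry: number,
    delayMs: number
): Promise<number> {
    let exitCode = -1
    for (let r = 0; r < Math.max(runRetry, 1); ++r) {
        exitCode = await run()
        if (exitCode === 0) break
        // the reviewed change: a retry is only scheduled when runRetry > 1
        if (runRetry > 1 && r + 1 < runRetry) {
            console.error(
                `error: run failed with ${exitCode}, retry #${r + 1}/${runRetry} in ${delayMs}ms`
            )
            await delay(delayMs)
        }
    }
    return exitCode
}
```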

@@ -70,7 +70,7 @@ export interface PromptScriptRunOptions {
model: string


The type of cache has been changed from boolean to boolean | string. This could potentially break existing code that expects cache to be a boolean. Please ensure that all usage of cache has been updated to handle the new type.

generated by pr-review-commit cache_type_change
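
A small sketch of how call sites might narrow the widened type; resolveCacheName and defaultName are illustrative names, not from the PR:

```ts
// Resolves the widened cache option into an effective cache name,
// or undefined when caching is disabled. Illustrative only.
function resolveCacheName(
    cache: boolean | string | undefined,
    defaultName: string // e.g. the CHAT_CACHE value "chat"
): string | undefined {
    if (cache === undefined || cache === false) return undefined // caching disabled
    return typeof cache === "string" ? cache : defaultName
}

// resolveCacheName(true, "chat")      -> "chat"
// resolveCacheName("summary", "chat") -> "summary"
// resolveCacheName(false, "chat")     -> undefined
```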

@pelikhan merged commit 68bb5ca into main Aug 29, 2024
11 checks passed
@pelikhan deleted the no-cache-defaults branch August 29, 2024 19:47