Add prompty parser #687

Merged
merged 24 commits into from
Sep 4, 2024
Changes from 4 commits
182b72a
added prompty parser
pelikhan Sep 4, 2024
70b8b05
add git hooks docs
pelikhan Sep 4, 2024
ad3cf72
Add support for converting .prompty files to genaiscript in CLI and c…
pelikhan Sep 4, 2024
af3bde9
add mustache loop
pelikhan Sep 4, 2024
b30095c
Refactor code formatting and enhance script generation with template …
pelikhan Sep 4, 2024
2822fe4
Refactor and enhance configurations in prompty.ts and update script m…
pelikhan Sep 4, 2024
532b5ab
Refactor code formatting and enhance model parameter handling in prom…
pelikhan Sep 4, 2024
721668b
Standardize formatting and add temperature and maxTokens settings in …
pelikhan Sep 4, 2024
82f5a5a
add output option
pelikhan Sep 4, 2024
24cec98
updated profile
pelikhan Sep 4, 2024
523fa43
Refactor code to support `.prompty` files and fix indentation in docs
pelikhan Sep 4, 2024
48e72a7
Refine regex pattern in `genaiscript.fragment.prompt` condition in VS…
pelikhan Sep 4, 2024
3937fa2
Add documentation for running Prompty files in GenAIScript
pelikhan Sep 4, 2024
4be861f
Add note about image support in prompty documentation
pelikhan Sep 4, 2024
e80e9c3
Add Prompty card to documentation index with an example
pelikhan Sep 4, 2024
9e40068
trace screenshot
pelikhan Sep 4, 2024
35cf3e7
Merge branch 'promptyrun' of https://github.com/microsoft/genaiscript…
pelikhan Sep 4, 2024
e604b09
support for tests of prompty files
pelikhan Sep 4, 2024
35ce3ad
add prompty to readme
pelikhan Sep 4, 2024
d7040d0
Add meta object initialization in promptyParse results for empty and …
pelikhan Sep 4, 2024
fad5773
Refactor message formatting in promptyToGenAIScript function and remo…
pelikhan Sep 4, 2024
2f8c934
Add typecheck to pre-commit hook and fix assignment typo in promptyTo…
pelikhan Sep 4, 2024
1879019
Add Evals to .vscode settings and update section title in README.md
pelikhan Sep 4, 2024
67be940
fix sample tests
pelikhan Sep 4, 2024
3 changes: 2 additions & 1 deletion .vscode/extensions.json
@@ -10,6 +10,7 @@
"tldraw-org.tldraw-vscode",
"bierner.emojisense",
"github.vscode-pull-request-github",
- "ms-toolsai.prompty"
+ "ms-toolsai.prompty",
+ "unifiedjs.vscode-mdx"
]
}
44 changes: 37 additions & 7 deletions docs/src/content/docs/guides/auto-git-commit-message.mdx
@@ -13,7 +13,7 @@
The script acts as a regular node.js automation script and uses [runPrompt](/genaiscript/reference/scripts/inner-prompts)
Incorrect heading formatting, should use markdown heading style with '#'.

generated by pr-docs-review-commit incorrect_formatting

Incorrect header formatting, should use Markdown header style.

generated by pr-docs-review-commit mdx_header_format

Incorrect heading format, remove emoji from heading.

generated by pr-docs-review-commit mdx_heading_format

to issue calls to the LLM and ask the user to confirm the generated text.

- ## 🔍 **Explaining the Script**
+ ## Explaining the Script

First, we check if there are any staged changes in the Git repository:

@@ -24,7 +24,7 @@
If no changes are staged, we ask the user if they want to stage all changes. If the user confirms, we stage all changes. Otherwise, we bail out.

```ts
- const stage = await host.confirm("No staged changes. Stage all changes?", {
+ const stage = await host.confirm("No staged changes. Stage all changes?", {
default: true,
})
if (stage) {
@@ -70,8 +70,7 @@

```ts
if (choice === "edit") {
- message = await host.input("Edit commit message",
- { required: true })
+ message = await host.input("Edit commit message", { required: true })
choice = "commit"
}
```
@@ -84,7 +83,7 @@
}
```

Incorrect heading formatting, should use markdown heading style with '#'.

generated by pr-docs-review-commit incorrect_formatting

Incorrect header formatting, should use Markdown header style.

generated by pr-docs-review-commit mdx_header_format

Incorrect heading format, remove emoji from heading.

generated by pr-docs-review-commit mdx_heading_format

- ## 🚀 **Running the Script**
+ ## Running the Script

You can run this script using the [CLI](/genaiscript/reference/cli).

@@ -96,8 +95,11 @@

```json '"gcm": "genaiscript run gcm"'
{
+ "devDependencies": {
+ "genaiscript": "*"
+ },
"scripts": {
- "gcm": "npx --yes genaiscript run gcm"
+ "gcm": "genaiscript run gcm"
}
Incorrect JSON format, missing curly brace to close the JSON object.

generated by pr-docs-review-commit json_format

pelikhan marked this conversation as resolved.
}
```
@@ -108,7 +110,35 @@
npm run gcm
```

## Using git hooks

You can also attach to the [commit-msg](https://git-scm.com/docs/githooks#_commit_msg) git hook to run the message generation on demand.
Typo found, change "huksy" to "husky".

generated by pr-docs-review-commit typo

Using the [huksy](https://typicode.github.io/husky/) framework, we can register the execution
of genaiscript in the `.husky/commit-msg` file.
Incorrect tool name 'huksy'; the correct name is 'Husky'.

generated by pr-docs-review-commit incorrect_tool_name


The `commit-msg` hook receives a file location where the message is stored. We pass this parameter to the script
so that it gets populated in the `env.files` variable.

```bash title=".husky/commit-msg"
npx --yes genaiscript run commit-msg "$1"
Incorrect usage of 'npx --yes'; it should be removed as it is not necessary when running local npm scripts or dependencies.

generated by pr-docs-review-commit incorrect_usage

```

In the script, we check if the content of the file already has a user message, otherwize generate a new message.
Typo found, change "otherwize" to "otherwise".

generated by pr-docs-review-commit typo

pelikhan marked this conversation as resolved.

```js title="commit-msg.genai.mts"
const msg = env.files[0] // file created by git to hold the message
const msgContent = msg.content // check if the user added any message
?.split(/\n/g)
.filter((l) => l && !/^#/.test(l)) // filter out comments
.join("\n")
if (msgContent) cancel("commit message already exists")
Comments in JavaScript should start with // or /*, not #.

generated by pr-docs-review-commit js_comment_handling

Ensure the logic for checking if a commit message already exists is correct and handles edge cases properly.

generated by pr-docs-review-commit script_logic


...

await host.writeText(msg.filename, message)
Incorrect code comment 'otherwize'; the correct spelling is 'otherwise'.

generated by pr-docs-review-commit incorrect_code_comment

The code comment 'cancel("commit message already exists")' suggests using a function 'cancel' which is not defined or explained in the provided code snippet.

generated by pr-docs-review-commit incorrect_code_comment

Verify that the script correctly writes the commit message to the file specified by the git hook.

generated by pr-docs-review-commit script_logic

```
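The filtering step in `commit-msg.genai.mts` can be read in isolation. The following is a minimal TypeScript sketch, not the code from this PR: `userCommitMessage` is a hypothetical helper name, and unlike the snippet above it also drops whitespace-only lines. The `#` test targets the comment lines that git writes into the commit message file (git's default comment character), not JavaScript comments.

```typescript
// Hypothetical helper (not part of the PR): returns the text the user
// actually typed, with git's "#" comment lines and blank lines removed.
function userCommitMessage(content: string | undefined): string {
    return (content ?? "")
        .split(/\r?\n/)
        // keep lines that are non-blank and do not start with "#"
        .filter((line) => line.trim() !== "" && !/^#/.test(line))
        .join("\n")
}

// A file that only holds git's template comments yields an empty string,
// so a script could go on to generate a message in that case.
const templateOnly = "\n# Please enter the commit message for your changes.\n#\n"
console.log(userCommitMessage(templateOnly) === "")
```

An empty result means the user has not written a message yet; a non-empty result is the case where the documented script calls `cancel`.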

Check failure on line 139 in docs/src/content/docs/guides/auto-git-commit-message.mdx

GitHub Actions / build

The section "Using git hooks" introduces the use of a third-party tool called "huksy" which is likely a typo and should be corrected to "husky".
The section "Using git hooks" introduces the use of a third-party tool called "huksy" which is likely a typo and should be corrected to "husky".

generated by pr-docs-review-commit documentation_content


## Acknowledgements

- This script was inspired from Karpathy's
+ This script was inspired from Karpathy's
[commit message generator](https://gist.github.com/karpathy/1dd0294ef9567971c1e4348a90d69285).
18 changes: 17 additions & 1 deletion docs/src/content/docs/reference/cli/commands.md
@@ -2,10 +2,11 @@
title: Commands
description: List of all CLI commands
sidebar:
- order: 100
+ order: 100
---

<!-- autogenerated, do not edit -->

A full list of the CLI command and its respective help text.

## `run`
@@ -308,7 +309,8 @@
code <file> [query] Parse code using tree sitter and executes a
query
tokens [options] <files...> Count tokens in a set of files
jsonl2json Converts JSONL files to a JSON file

Check failure on line 312 in docs/src/content/docs/reference/cli/commands.md

GitHub Actions / build

The command `prompty` is mentioned but it seems to be a duplicate of the `parse prompty` command which is explained in detail later in the document. This could lead to confusion and should be clarified or removed if it's indeed a duplicate.
The command prompty is mentioned but it seems to be a duplicate of the parse prompty command which is explained in detail later in the document. This could lead to confusion and should be clarified or removed if it's indeed a duplicate.

generated by pr-docs-review-commit documentation_content

prompty <file...> Converts .prompty files to genaiscript
```

### `parse fence`
@@ -388,7 +390,21 @@
Options:
-h, --help display help for command
```

### `parse prompty`
Incorrect heading format, remove the emoji to match MDX standards.

generated by pr-docs-review-commit mdx_heading_format

pelikhan marked this conversation as resolved.

```
Usage: genaiscript parse prompty [options] <file...>

Converts .prompty files to genaiscript

Arguments:
file input JSONL files

Options:
-h, --help display help for command

Check failure on line 405 in docs/src/content/docs/reference/cli/commands.md

GitHub Actions / build

The section `parse prompty` is introduced but the usage description incorrectly states "input JSONL files" for the `file` argument. Since this command is for converting `.prompty` files, the description should be corrected to reflect the appropriate file type.
The section parse prompty is introduced but the description "Converts .prompty files to genaiscript" is not accurate since .prompty is not a recognized file extension. This should be corrected to reflect the actual functionality of the command.

generated by pr-docs-review-commit documentation_content

```
Duplicate content, the parse prompty command is documented twice.

generated by pr-docs-review-commit mdx_duplicate_content


## `workspace`

```
10 changes: 5 additions & 5 deletions genaisrc/test-gen.genai.mjs
@@ -1,18 +1,19 @@
script({
model: "openai:gpt-4",
title: "unit test generator",
system: ["system", "system.typescript", "system.files"],
tools: ["fs"],
})

const code = def("CODE", env.files)


$`## Step 1

For each file in ${code},
generate a plan to test the source code in each file

- - use input test files from packages/sample/src/rag/*
+ - generate self-contained tests as much as possible by inlining all necessary values
+ - if needed, use input test files from packages/sample/src/rag/*
- only generate tests for files in ${code}
- update the existing test files (<code filename>.test.ts). keep old tests if possible.

@@ -41,7 +42,7 @@ ${fence('import test, { beforeEach, describe } from "node:test"', { language: "j

Validate and fix test sources.

- Use 'run_test' tool to execute the generated test code and fix the test code to make tests pass.
+ Call the 'run_test' tool to execute the generated test code and fix the test code to make tests pass.

- this is important.
`
@@ -55,8 +56,7 @@ defTool(
},
async (args) => {
const { filename, source } = args
- if (source)
- await workspace.writeText(filename, source)
+ if (source) await workspace.writeText(filename, source)
console.debug(`running test code ${filename}`)
return host.exec(`node`, ["--import", "tsx", "--test", filename])
}
6 changes: 6 additions & 0 deletions packages/cli/src/cli.ts
@@ -13,6 +13,7 @@
parseHTMLToText,
parsePDF,
parseTokens,
prompty2genaiscript,
} from "./parse"
import { compileScript, createScript, fixScripts, listScripts } from "./scripts"
import { codeQuery } from "./codequery"
@@ -299,6 +300,11 @@
.command("jsonl2json", "Converts JSONL files to a JSON file")
.argument("<file...>", "input JSONL files")
.action(jsonl2json)
parser
.command("prompty")
.description("Converts .prompty files to genaiscript")
.argument("<file...>", "input JSONL files")
.action(prompty2genaiscript)

Check failure on line 307 in packages/cli/src/cli.ts

GitHub Actions / build

The description for the argument "<file...>" should be "input .prompty files" instead of "input JSONL files".
The argument description for the "prompty" command is incorrect. It should not be "input JSONL files" but rather "input .prompty files".

generated by pr-review-commit incorrect_argument_description

pelikhan marked this conversation as resolved.

const workspace = program
.command("workspace")
13 changes: 13 additions & 0 deletions packages/cli/src/parse.ts
@@ -10,6 +10,7 @@
import { YAMLStringify } from "../../core/src/yaml"
import { resolveTokenEncoder } from "../../core/src/encoders"
import { DEFAULT_MODEL } from "../../core/src/constants"
import { promptyParse, promptyToGenAIScript } from "../../core/src/prompty"

export async function parseFence(language: string, file: string) {
const res = await parsePdf(file)
@@ -69,3 +70,15 @@
}
console.log(text)
}

export async function prompty2genaiscript(files: string[]) {
const fs = await expandFiles(files)
for (const f of fs) {
console.log(f)
const gf = replaceExt(f, ".genai.mts")
There is no error handling for the promptyParse function. If the parsing fails, the error will not be caught and the application may crash.

generated by pr-review-commit missing_error_handling

const content = await readText(f)
const doc = promptyParse(content)
const script = promptyToGenAIScript(doc)

Check failure on line 81 in packages/cli/src/parse.ts

GitHub Actions / build

There is no error handling for the async operations readText and writeText. If these operations fail, the error will not be caught and the program may crash unexpectedly.
await writeText(gf, script)
}
}
There is no error handling for the async operations within the "prompty2genaiscript" function. If an error occurs during file reading or writing, it will not be caught and handled.

generated by pr-review-commit missing_error_handling
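One way to address the missing error handling flagged above is to isolate failures per file, so that one unreadable or malformed input does not abort the whole batch. This is a hedged sketch under stated assumptions, not the PR's code: `convertFiles`, `replaceExtension`, and the `Convert` type are hypothetical names, Node's `node:fs/promises` stands in for the project's `readText`/`writeText` helpers, and the `convert` callback stands in for `promptyParse` followed by `promptyToGenAIScript`.

```typescript
import { readFile, writeFile } from "node:fs/promises"

// Hypothetical stand-in for promptyParse + promptyToGenAIScript.
type Convert = (content: string) => string

interface ConvertReport {
    converted: string[]
    failed: { file: string; error: string }[]
}

// Hypothetical helper: swaps the last extension of a filename for `ext`.
function replaceExtension(filename: string, ext: string): string {
    return filename.replace(/\.[^./\\]+$/, "") + ext
}

// Converts each file independently and records failures instead of throwing.
async function convertFiles(
    files: string[],
    convert: Convert
): Promise<ConvertReport> {
    const report: ConvertReport = { converted: [], failed: [] }
    for (const file of files) {
        try {
            const content = await readFile(file, "utf-8")
            const script = convert(content) // may throw on malformed input
            const target = replaceExtension(file, ".genai.mts")
            await writeFile(target, script, "utf-8")
            report.converted.push(target)
        } catch (e) {
            const error = e instanceof Error ? e.message : String(e)
            report.failed.push({ file, error })
        }
    }
    return report
}
```

A caller could print `report.failed` and set a non-zero exit code when it is non-empty, which keeps the CLI behavior predictable for batch conversions.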

The functions 'expandFiles', 'replaceExt', 'readText', and 'writeText' are used but not imported or defined in this file.

generated by pr-review-commit missing_dependency

4 changes: 3 additions & 1 deletion packages/core/src/json5.ts
@@ -1,5 +1,5 @@
/* eslint-disable curly */
- import { parse } from "json5"
+ import { parse, stringify } from "json5"
import { jsonrepair } from "jsonrepair"

export function isJSONObjectOrArray(text: string) {
@@ -61,3 +61,5 @@ export function JSONLLMTryParse(s: string): any {
s = s.replace(startRx, "").replace(endRx, "")
return JSON5TryParse(s)
}

export const JSON5Stringify = stringify
2 changes: 1 addition & 1 deletion packages/core/src/mustache.ts
@@ -18,7 +18,7 @@ export async function interpolateVariables(

// remove prompty roles
// https://github.com/microsoft/prompty/blob/main/runtime/prompty/prompty/parsers.py#L113C21-L113C77
- content = content.replace(/^\s*(system|user):\s*$/gim, "\n")
+ content = content.replace(/^\s*(system|user|assistant)\s*:\s*$/gim, "\n")
The regular expression used to replace roles in the content has been updated to include 'assistant' along with 'system' and 'user'. This change might affect the existing functionality if the 'assistant' role was not intended to be replaced in the content.

generated by pr-review-commit regex_update
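To make the effect of this regex change concrete, the sketch below applies the old and the new pattern to the same text. The two patterns are copied from the diff above; `stripRoles`, `nonBlankLines`, and the sample text are illustrative names and data that do not appear in the PR.

```typescript
// Patterns copied from the diff: before and after the change.
const OLD_ROLE_RX = /^\s*(system|user):\s*$/gim
const NEW_ROLE_RX = /^\s*(system|user|assistant)\s*:\s*$/gim

// Hypothetical helper: replaces role marker lines with a newline.
function stripRoles(content: string, rx: RegExp): string {
    return content.replace(rx, "\n")
}

// Hypothetical helper: the lines that still carry text after stripping.
function nonBlankLines(content: string): string[] {
    return content.split("\n").filter((line) => line.trim() !== "")
}

const sample = ["system:", "You are helpful.", "assistant :", "Hello!"].join("\n")

// The old pattern leaves the "assistant :" marker in place;
// the new one removes it, including the space before the colon.
console.log(nonBlankLines(stripRoles(sample, OLD_ROLE_RX)))
console.log(nonBlankLines(stripRoles(sample, NEW_ROLE_RX)))
```

The new pattern differs in two ways: it recognizes the `assistant` role, and it tolerates whitespace between the role name and the colon.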


// remove xml tags
// https://humanloop.com/docs/prompt-file-format
75 changes: 75 additions & 0 deletions packages/core/src/prompty.test.ts
@@ -0,0 +1,75 @@
import { promptyParse } from "./prompty"
import { describe, test, beforeEach } from "node:test"
The imported modules 'node:test' and 'node:assert/strict' might not exist. Please ensure that these modules are available in the node environment.

generated by pr-review-commit import_error

import assert from "node:assert/strict"

describe("promptyParse", () => {
test("correctly parses an empty markdown string", () => {
const result = promptyParse("")
assert.deepStrictEqual(result, {
frontmatter: {},
content: "",
messages: [],
})
})

test("correctly parses a markdown string without frontmatter", () => {
const content = "This is a sample content without frontmatter."
const result = promptyParse(content)
assert.deepStrictEqual(result, {
frontmatter: {},
content: content,
messages: [{ role: "system", content: content }],
})
})

test("correctly parses a markdown string with valid frontmatter", () => {
const markdownString = `---
name: Test
description: A test description
version: 1.0.0
authors:
- Author1
- Author2
tags:
- tag1
- tag2
sample:
key: value
---
# Heading
Content below heading.`
const result = promptyParse(markdownString)
assert.deepStrictEqual(result.frontmatter, {
name: "Test",
description: "A test description",
version: "1.0.0",
authors: ["Author1", "Author2"],
tags: ["tag1", "tag2"],
sample: { key: "value" },
})
assert.strictEqual(result.content, "# Heading\nContent below heading.")
})

test("correctly parses a markdown string with content split into roles", () => {
const markdownContent = `user:
User's message
assistant:
Assistant's reply
user:
Another message from the user`
const result = promptyParse(markdownContent)
assert.deepStrictEqual(result.messages, [
{ role: "user", content: "User's message" },
{ role: "assistant", content: "Assistant's reply" },
{ role: "user", content: "Another message from the user" },
])
})

test("correctly handles a markdown string with content but without roles", () => {
const markdownContent = `Just some content without specifying roles.`
const result = promptyParse(markdownContent)
assert.deepStrictEqual(result.messages, [
{ role: "system", content: markdownContent },
])
})
})
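The tests above pin down the role-splitting behavior without showing how it might be implemented. The sketch below is one possible line-based approach that would satisfy the `messages` expectations in those tests. It is an assumption for illustration only: `splitRoles`, `Role`, and `Message` are hypothetical names, and the actual `promptyParse` in `prompty.ts` is not shown in this view and may work differently.

```typescript
type Role = "system" | "user" | "assistant"

interface Message {
    role: Role
    content: string
}

// Hypothetical sketch: splits text into messages at "role:" marker lines.
// Text that appears before any marker is attributed to the "system" role.
function splitRoles(content: string): Message[] {
    const messages: Message[] = []
    let role: Role = "system"
    let lines: string[] = []

    // Emits the accumulated lines as one message, skipping empty chunks.
    const flush = () => {
        const text = lines.join("\n").trim()
        if (text) messages.push({ role, content: text })
        lines = []
    }

    for (const line of content.split(/\r?\n/)) {
        const m = /^\s*(system|user|assistant)\s*:\s*$/i.exec(line)
        if (m) {
            flush() // close the previous message before switching roles
            role = m[1].toLowerCase() as Role
        } else {
            lines.push(line)
        }
    }
    flush()
    return messages
}

console.log(splitRoles("user:\nUser's message\nassistant:\nAssistant's reply"))
```

Defaulting to `system` for unmarked content mirrors the "without roles" test case, and skipping empty chunks mirrors the empty-string case, which expects no messages.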