V2: Improve AI review action (#10)
- Added tools (function calling); see the sketch after the file summary below.
- Added support for dynamic grading.

---------

Co-authored-by: Bodhish Thomas <[email protected]>
yash-learner and bodhish authored Mar 14, 2024
1 parent 3081628 commit 386743f
Showing 7 changed files with 246 additions and 133 deletions.
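The commit description mentions adding function-calling tools for dynamic grading, but the diff below only shows the call sites (`@reviewer.available_tools`, `@reviewer.tool_choice`). The following is a minimal sketch of what such a tool definition might look like; the tool name, parameter schema, and `Reviewer` methods shown here are assumptions for illustration, not the repository's actual implementation.

```ruby
# Hypothetical sketch of a Reviewer exposing an OpenAI function-calling tool
# for grading. Names and the parameter schema are assumptions.
class Reviewer
  def available_tools
    [
      {
        type: "function",
        function: {
          name: "grade_submission",
          description: "Record the review outcome for a student submission",
          parameters: {
            type: "object",
            properties: {
              status: { type: "string", enum: ["accepted", "rejected"] },
              feedback: { type: "string", description: "Feedback in Markdown" },
              grades: {
                type: "array",
                items: {
                  type: "object",
                  properties: {
                    evaluationCriterionId: { type: "string" },
                    grade: { type: "integer" }
                  }
                }
              }
            },
            required: ["status", "feedback"]
          }
        }
      }
    ]
  end

  def tool_choice
    # Force the model to call the grading tool instead of replying in free text.
    { type: "function", function: { name: "grade_submission" } }
  end
end
```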
34 changes: 17 additions & 17 deletions .github/workflows/test.yml
@@ -13,36 +13,36 @@ env:
TEST_MODE: true
WORKFLOW_FILE_PATH: ./.github/workflows/test.yml
jobs:
test: # make sure the action works on a clean machine without building
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: AI auto review
id: ai-review
uses: ./
with:
ROLE_PROMPT: "You are an advanced English Language Teaching Assistant AI. Your task involves reviewing and providing feedback on student submissions, paying meticulous attention to grammar, punctuation, and style errors."
env:
ROLE_PROMPT: "You are an advanced English Language Teaching Assistant AI. Your task involves grading student submissions, paying meticulous attention to grammar, punctuation, and style errors."
USER_PROMPT: |
The conversation should include the following:
The submission is about writing up a conversation between a student and an instructor at Pupilfirst, with at least 100 words.
- The specific Discord channel the conversation takes place in.
- The initial question, marked with "Student: ", outlining the student's doubt.
- The instructor's response, labelled with "Instructor: ", that provides a solution.
- A follow-up question for clarification, again starting with "Student: ", to delve into what the instructor meant.
Ensure that the student applies the lessons they learned in the current level:
- Provide context, steps taken, and error messages for both the initial question and the follow-up.
- Frame questions around the "why" and "how" aspects.
- Ask for additional examples, if necessary.
- Thank the instructor in a proper and considerate manner.
The feedback should focus on the following areas (with the ideal condition in brackets):
1. Providing Context & Background (The student delivers clear and detailed context, steps taken, and error messages).
2. Clarity (The conversation is clear and easy to understand throughout).
3. Expressing Thanks (The student thanks the instructor genuinely and appropriately).
4. Appropriate Tone & Etiquette (The student maintains a professional and respectful tone throughout the conversation).
When looking at the student's submission, you should check for the following requirements:
- Provided the context, steps taken, and error messages for both the initial question and the follow-up.
- Framed the questions around the "why" and "how" aspects.
- Asked for additional examples, if necessary.
- Thanked the instructor in a proper and considerate manner.
Make sure to identify and highlight all grammar, punctuation, and style errors.
The student's submission will be as follows:
As per the above requirements, add a grading to your feedback.
Choose the status as "accepted" and the grade of the submission as "1" for evaluation_criteria_id 3361 when
- The submission meets the ideal conditions in all areas.
Choose the status as "rejected" and send an empty [] for the grades property when
- The submission does not meet the ideal conditions in all areas.
${SUBMISSION}
The student's submission is
${SUBMISSION}
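Putting the grading instructions above together with the fields consumed by `app/pupilfirst_api.rb` further down (`result[:status]`, `result[:feedback]`, `result[:grades]`), a parsed tool-call result for an accepted submission might plausibly look like the sketch below. The function name and exact argument keys are assumptions; only `:status`, `:feedback`, and `:grades` are implied by the grading code in this commit.

```ruby
# Hypothetical parsed tool-call result for an accepted submission.
result = {
  function_name: "grade_submission",
  args: {
    status: "accepted",
    feedback: "Well done! Your conversation provides clear context and a polite follow-up.",
    grades: [
      { evaluationCriterionId: "3361", grade: 1 }
    ]
  }
}
```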
8 changes: 5 additions & 3 deletions README.md
@@ -22,15 +22,16 @@ The application uses the following environment variables for configuration:
10. `REVIEW_END_POINT`: This environment variable specifies the URL of the endpoint where the reviews are sent.
11. `REVIEW_BOT_USER_TOKEN`: This environment variable represents the token used for authorization when sending the reviews.
12. `WORKFLOW_FILE_PATH`: The path to your GitHub Actions workflow file. Default value is `.github/workflows/ci.js.yml`. Update this if you use a different path or file name for your workflow.
13. `SKIP_GRADING`: If set to `true`, the action will only create feedback in the LMS and not send a review to the review endpoint. Default value is `false`.

> Note: You must specify USER_PROMPT and ROLE_PROMPT unless you provide a SYSTEM_PROMPT.
> [!NOTE]
> You must specify USER_PROMPT and ROLE_PROMPT unless you provide a SYSTEM_PROMPT.
## How to Set Environment Variables

In GitHub Actions, you can set environment variables for a specific step in your workflow file (.github/workflows/workflow.yml). Here's an example:

> Note: Use `|` (Literal Block Scalar) instead of `>` (Folded Block Scalar) when writing prompts spanning multiple lines (see `USER_PROMPT` in the example below).
> [!CAUTION]
> Use `|` (Literal Block Scalar) instead of `>` (Folded Block Scalar) when writing prompts spanning multiple lines (see `USER_PROMPT` in the example below).
```yaml
name: "English Language Course L1 | Auto Grade"
@@ -57,6 +58,7 @@ jobs:
id: ai-review
uses: pupilfirst/ai-review-action@v1
env:
OPEN_AI_MODEL: gpt-4-turbo-preview
ROLE_PROMPT: "You are an advanced English Language Teaching Assistant AI. Your task involves reviewing and providing feedback on student submissions, paying meticulous attention to grammar, punctuation, and style errors."
USER_PROMPT: |
The conversation should include the following:
```
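As an aside on the block-scalar caution above, here is a minimal Ruby sketch of the difference between `|` and `>`; nothing in it is part of the action itself.

```ruby
require "yaml"

# "|" (literal) keeps line breaks; ">" (folded) collapses them into spaces.
literal = YAML.safe_load("prompt: |\n  line one\n  line two\n")["prompt"]
folded  = YAML.safe_load("prompt: >\n  line one\n  line two\n")["prompt"]

puts literal.inspect # => "line one\nline two\n"
puts folded.inspect  # => "line one line two\n"
```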
146 changes: 85 additions & 61 deletions app/open_ai_client.rb
@@ -1,31 +1,35 @@
require 'openai'
require 'yaml'
require "openai"
require "yaml"
require "json"

class OpenAIClient
def initialize
@client = OpenAI::Client.new

@config = extract_relevant_step_configuration
@model = @config.fetch('OPEN_AI_MODEL', "gpt-3.5-turbo")
@temperature = @config.fetch('OPEN_AI_TEMPERATURE', 0.1).to_f
@system_prompt = @config.fetch('SYSTEM_PROMPT', system_prompt_default)
@model = @config.fetch("OPEN_AI_MODEL", "gpt-3.5-turbo")
@temperature = @config.fetch("OPEN_AI_TEMPERATURE", 0.1).to_f
@system_prompt = @config.fetch("SYSTEM_PROMPT", system_prompt_default)

@submission = Submission.new
@reviewer = Reviewer.new(@submission)
end

def extract_relevant_step_configuration
# Load workflow YAML file from the path specified in the environment variable or the default path.
file_path = ENV.fetch('WORKFLOW_FILE_PATH', './.github/workflows/ci.js.yml')
file_path = ENV.fetch("WORKFLOW_FILE_PATH", "./.github/workflows/ci.js.yml")

# Find the job step that uses 'pupilfirst/ai-review-action' or has an ID containing 'ai-review'.
content = YAML.safe_load(File.read(file_path))
content = YAML.safe_load_file(file_path)

@config = content.dig('jobs', 'test', 'steps').find do |step|
( step['uses']&.include?('pupilfirst/ai-review-action') || step['id']&.include?('ai-review') )
end['env']
@config = content.dig("jobs", "test", "steps").find do |step|
(step["uses"]&.include?("pupilfirst/ai-review-action") || step["id"]&.include?("ai-review"))
end["env"]

if @config.nil?
p content

raise 'Could not read configuration from environment variables. Please check the workflow file.'
raise "Could not read configuration from environment variables. Please check the workflow file."
end

@config
@@ -34,78 +34,98 @@ def extract_relevant_step_configuration
def ask
puts prompt
response = @client.chat(
parameters: {
model: @model,
messages: [
{ role: "system", content: prompt }
],
temperature: @temperature,
})
parameters: {
model: @model,
messages: [
{role: "system", content: prompt}
],
tools: @reviewer.available_tools,
tool_choice: @reviewer.tool_choice,
temperature: @temperature
}
)
puts response
response.dig("choices", 0, "message", "content")

message = response.dig("choices", 0, "message")
if message["role"] == "assistant" && message["tool_calls"]
message["tool_calls"].each do |tool_call|
function_name = tool_call.dig("function", "name")
args_json = tool_call.dig("function", "arguments")
begin
args = JSON.parse(args_json, symbolize_names: true)
return {function_name: function_name, args: args}
rescue JSON::ParserError => e
puts "Error parsing JSON arguments: #{e.message}"
end
end
else
{function_name: "errored", args: {}}
end
end

def prompt
@system_prompt
.gsub("${ROLE_PROMPT}", default_role_prompt)
.gsub("${INPUT_DESCRIPTION}", default_input_prompt)
.gsub("${USER_PROMPT}", default_user_prompt)
.gsub("${SUBMISSION}", "#{Submission.new.checklist}")
.gsub("${OUTPUT_DESCRIPTION}", default_output_prompt)
.gsub("${ROLE_PROMPT}", default_role_prompt)
.gsub("${INPUT_DESCRIPTION}", default_input_prompt)
.gsub("${USER_PROMPT}", default_user_prompt)
.gsub("${SUBMISSION}", "#{@submission.checklist}")
.gsub("${EC_PROMPT}", default_evaluation_criteria_prompt)
.gsub("${SUBMISSION_EC}", "#{@submission.evaluation_criteria}")
end

def system_prompt_default
<<-SYSTEM_PROMPT
#{@config.fetch("ROLE_PROMPT", "${ROLE_PROMPT}")}
<<~SYSTEM_PROMPT
#{@config.fetch("ROLE_PROMPT", "${ROLE_PROMPT}")}
#{@config.fetch("INPUT_DESCRIPTION", "${INPUT_DESCRIPTION}")}
#{@config.fetch("INPUT_DESCRIPTION", "${INPUT_DESCRIPTION}")}
#{@config.fetch("USER_PROMPT", "${USER_PROMPT}")}
#{@config.fetch("USER_PROMPT", "${USER_PROMPT}")}
#{@config.fetch("OUTPUT_DESCRIPTION", "${OUTPUT_DESCRIPTION}")}
SYSTEM_PROMPT
#{@config.fetch("EC_PROMPT", "${EC_PROMPT}")}
SYSTEM_PROMPT
end

def default_role_prompt
<<-ROLE_PROMPT
You are an advanced Teaching Assistant AI. Your task involves reviewing and providing feedback on student submissions.
ROLE_PROMPT
<<~ROLE_PROMPT
You are an advanced Teaching Assistant AI. Your task involves reviewing and providing feedback on student submissions.
ROLE_PROMPT
end

def default_user_prompt
<<-USER_PROMPT
The student's submission will be as follows:
${SUBMISSION}
USER_PROMPT
<<~USER_PROMPT
The student's submission will be as follows:
${SUBMISSION}
USER_PROMPT
end

def default_input_prompt
<<-INPUT_PROMPT
The student's submissions will be an array of objects following the provided schema:
```json
{
"kind": "The type of answer - can be shortText, longText, link, files, or multiChoice",
"title": "The question that was asked of the student",
"result": "The student's response",
"status": "Field for internal use; ignore this field during your review"
}
```
INPUT_PROMPT
end
<<~INPUT_PROMPT
The student's submissions will be an array of objects following the provided schema:
def default_output_prompt
<<-OUTPUT_PROMPT
Please provide your response in the following JSON format. Adhere to the format strictly and escape all line-breaks within strings using \\\\n.
{
"kind": "The type of answer - can be shortText, longText, link, files, or multiChoice",
"title": "The question that was asked of the student",
"result": "The student's response",
"status": "Field for internal use; ignore this field during your review"
}
```json
{
"status": "\"passed\" or \"failed\"",
"feedback": "Detailed feedback for the student in markdown format. Aim for a human-like explanation as much as possible."
}
```
INPUT_PROMPT
end

If the student submission is not related to question, share generic feedback.
OUTPUT_PROMPT
def default_evaluation_criteria_prompt
if @submission.evaluation_criteria.any?
<<~EC_PROMPT
The following describes an array of objects where each object represents an evaluation criterion for a submission. Each criterion object includes the following key attributes:
- id: This key stores the identifier for the evaluation criteria, which can be either a numeric value or a string.
- name: The name of the evaluation criterion, describing the aspect of the submission it assesses.
- max_grade: The maximum grade that can be assigned for this criterion.
- grade_labels: An array of objects, each containing a 'grade' and a 'label'. 'grade' is an integer representing a possible grade for the criterion, and 'label' is a description of what this grade signifies.
Below is the structured representation of the evaluation criteria for the current submission:
${SUBMISSION_EC}
EC_PROMPT
else
""
end
end
end
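The evaluation-criteria prompt above describes the `${SUBMISSION_EC}` payload only in prose. As a hypothetical illustration, `@submission.evaluation_criteria` might return a structure like the one below; the concrete criterion is invented, and only the keys come from the description in the prompt.

```ruby
# Hypothetical example of what @submission.evaluation_criteria could contain,
# matching the keys described in default_evaluation_criteria_prompt.
[
  {
    "id" => "3361",
    "name" => "Providing Context & Background",
    "max_grade" => 1,
    "grade_labels" => [
      { "grade" => 1, "label" => "Clear context, steps taken, and error messages provided" }
    ]
  }
]
```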
49 changes: 22 additions & 27 deletions app/pupilfirst_api.rb
@@ -1,32 +1,31 @@
require 'json'
require 'graphql/client'
require 'graphql/client/http'
require_relative 'submission'
require "json"
require "graphql/client"
require "graphql/client/http"
require_relative "submission"

# Pupilfirst API example wrapper
module PupilfirstAPI

module API
HTTP = GraphQL::Client::HTTP.new(ENV.fetch('REVIEW_END_POINT')) do
HTTP = GraphQL::Client::HTTP.new(ENV.fetch("REVIEW_END_POINT")) do
def headers(_context)
{ "Authorization": "Bearer #{ENV.fetch('REVIEW_BOT_USER_TOKEN')}" }
{Authorization: "Bearer #{ENV.fetch("REVIEW_BOT_USER_TOKEN")}"}
end
end

Schema = GraphQL::Client.load_schema('/app/graphql_schema.json')
Schema = GraphQL::Client.load_schema("/app/graphql_schema.json")

Client = GraphQL::Client.new(schema: Schema, execute: HTTP)
end

GradeMutation = API::Client.parse <<-'GRAPHQL'
GradeMutation = API::Client.parse <<-GRAPHQL
mutation($submissionId: ID!, $grades: [GradeInput!], $checklist: JSON!, $feedback: String) {
createGrading(submissionId: $submissionId, grades: $grades, checklist: $checklist, feedback: $feedback) {
success
}
}
GRAPHQL

CreateFeedbackMutation = API::Client.parse <<-'GRAPHQL'
CreateFeedbackMutation = API::Client.parse <<-GRAPHQL
mutation($submissionId: ID!, $feedback: String!) {
createFeedback(submissionId: $submissionId, feedback: $feedback) {
success
@@ -37,56 +36,52 @@ def headers(_context)
class Grader
def initialize(submission = Submission.new)
@submission = submission
@test_mode = ENV.fetch('TEST_MODE', 'false') == 'true'
@test_mode = ENV.fetch("TEST_MODE", "false") == "true"
end

def grade(result)
return puts "Unknown status: #{result['status'].inspect}. Skipping grading..." unless valid_status?(result['status'])
return puts "Unknown status: #{result[:status].inspect}. Skipping grading..." unless valid_status?(result[:status])

variables = {
submissionId: @submission.id,
checklist: @submission.checklist,
feedback: result['feedback']
feedback: result[:feedback]
}

grades = grades_based_on(result['status'])
# We could use result[:grades] directly, but this method guards against the model hallucinating grades for a rejected submission.
grades = grades_based_on(result)

variables[:grades] = grades if grades.length > 0

log_variables(variables) if @test_mode
create_grading(variables) unless @test_mode
rescue StandardError => e
rescue => e
handle_error(e)
end

def add_feedback(result)
variables = {
submissionId: @submission.id,
feedback: result['feedback']
feedback: result[:feedback]
}

log_variables(variables) if @test_mode
create_feedback(variables) unless @test_mode
rescue StandardError => e
rescue => e
handle_error(e)
end

private

def valid_status?(status)
%w[passed failed].include?(status)
%w[accepted rejected].include?(status)
end

def grades_based_on(status)
if status == 'passed'
return @submission.evaluation_criteria.map do |criteria|
{
evaluationCriterionId: criteria['id'],
grade: criteria['max_grade']
}
end
def grades_based_on(result)
if result[:status] == "accepted"
result[:grades]
else
return []
[]
end
end

