Commit
fix: fix a bug in model tuning feedback (#316)
* fix a bug

* fix two bugs
WinstonLiyt authored Sep 24, 2024
1 parent f6c2f77 commit 8aa088d
Showing 2 changed files with 3 additions and 4 deletions.
2 changes: 1 addition & 1 deletion rdagent/scenarios/kaggle/developer/feedback.py
@@ -114,7 +114,7 @@ def generate_feedback(self, exp: Experiment, hypothesis: Hypothesis, trace: Trac
# Prepare render dictionary
render_dict = {
"context": self.scen.get_scenario_all_desc(),
-"last_hypothesis": trace.hist[-1][0].hypothesis if trace.hist else None,
+"last_hypothesis": trace.hist[-1][0] if trace.hist else None,
"last_task_and_code": last_task_and_code,
"last_result": trace.hist[-1][1].result if trace.hist else None,
"hypothesis": hypothesis,
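
Note on this hunk: the prompt template in prompts.yaml reads the value as `{{last_hypothesis.hypothesis}}`, so the render dict has to carry the whole hypothesis object rather than the already-extracted `trace.hist[-1][0].hypothesis` text. The sketch below illustrates the mismatch in plain Python. The `Hypothesis` dataclass and `render_last_hypothesis` helper are hypothetical stand-ins, not the project's code, and the empty-string fallback is an assumption that mirrors a lenient template engine; a strict configuration would likely raise an error instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    # Hypothetical stand-in: only the text field read by the template is modeled.
    hypothesis: str


def render_last_hypothesis(last_hypothesis: Optional[object]) -> str:
    """Mimic the template line `Hypothesis: {{last_hypothesis.hypothesis}}`.

    Assumption: a missing attribute renders as an empty string.
    """
    if not last_hypothesis:
        return "This is the first round."
    return f"Hypothesis: {getattr(last_hypothesis, 'hypothesis', '')}"


hyp = Hypothesis(hypothesis="A deeper model improves the score")

# Old behaviour: the dict held the string, so the `.hypothesis` lookup found nothing.
before = render_last_hypothesis(hyp.hypothesis)

# New behaviour: the whole object is passed and the lookup succeeds.
after = render_last_hypothesis(hyp)

print(repr(before))
print(repr(after))
```

With the string passed in, the hypothesis text is silently dropped from the prompt; with the object passed in, it appears as intended.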
5 changes: 2 additions & 3 deletions rdagent/scenarios/kaggle/prompts.yaml
@@ -110,7 +110,7 @@ model_experiment_output_format: |-
}
Usually, a larger model works better than a smaller one. Hence, the parameters should be larger.
-model_feedback_generation:
+model_tuning_feedback_generation:
system: |-
You are a professional result analysis assistant. You will receive a result and a hypothesis.
Your task is to provide feedback on how well the result supports or refutes the hypothesis by judging from the observation of performance increase or decrease.
@@ -149,8 +149,7 @@ model_feedback_generation:
{% if last_hypothesis %}
Last Round Information:
Hypothesis: {{last_hypothesis.hypothesis}}
-Task: {{last_task}}
-Code Implemented: {{last_code}}
+Last Task and Code: {{last_task_and_code}}
Result: {{last_result}}
{% else %}
This is the first round. No previous information available. As long as the performance is not too negative (eg.ICIR is greater than 0), treat it as successful. Do not set the threshold too high.
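
Note on this hunk: the template previously referenced `last_task` and `last_code`, while the render dict in feedback.py supplies a single `last_task_and_code` key. A mismatch of this kind can be caught with a small check like the one below. This is only a sketch: `missing_template_vars` is a hypothetical helper, the regex handles simple `{{ name }}` placeholders only, and the dict values are dummy data.

```python
import re


def missing_template_vars(template: str, render_dict: dict) -> set:
    """Return placeholder names used in the template but absent from render_dict.

    Only the leading identifier of each `{{ ... }}` expression is considered,
    so `{{last_hypothesis.hypothesis}}` counts as `last_hypothesis`.
    """
    names = set(re.findall(r"\{\{\s*([A-Za-z_]\w*)", template))
    return names - set(render_dict)


old_template = """
Hypothesis: {{last_hypothesis.hypothesis}}
Task: {{last_task}}
Code Implemented: {{last_code}}
Result: {{last_result}}
"""

new_template = """
Hypothesis: {{last_hypothesis.hypothesis}}
Last Task and Code: {{last_task_and_code}}
Result: {{last_result}}
"""

# Keys taken from the render dict shown in feedback.py; values are placeholders.
render_dict = {
    "context": "...",
    "last_hypothesis": None,
    "last_task_and_code": None,
    "last_result": None,
    "hypothesis": None,
}

old_missing = missing_template_vars(old_template, render_dict)
new_missing = missing_template_vars(new_template, render_dict)

print(sorted(old_missing))
print(sorted(new_missing))
```

The old template leaves two placeholders without a matching key, while the updated template is fully covered by the render dict.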