HELM currently only evaluates on 5 LegalBench tasks. Ideally, we would like to be able to run evaluation on all tasks.
I took a quick look at the structure of the tasks and their prompts. Every task contains a `base_prompt.txt` file, a `train.tsv` file, and a `README.md` file, which together could be used to automatically construct a complete `helm_prompt_settings.jsonl` file.
The prompts in the existing `helm_prompt_settings.jsonl` file are modified versions of those in `base_prompt.txt`. Writing the jsonl file manually for all tasks would be a lot of work, so I would suggest the following:
1. Extract the first line of `base_prompt.txt` as the general instructions.
2. Use the `train.tsv` file to get the possible answer options and add them as a second line of the instructions. One question here: does the `train.tsv` file contain every possible answer option for a task at least once?
3. Use the "Data column names" field from the `README.md` to build the `field_ordering`, `label_keys` and `output_nouns`.
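To make the proposal concrete, here is a rough sketch of how the three steps could fit together. It is only an illustration under several assumptions: the helper names (`first_line`, `answer_options`, `build_settings`), the `task` and `instructions` keys, the wording of the second instruction line, and the value shapes of `field_ordering`, `label_keys` and `output_nouns` are all guesses on my side, not the actual schema HELM expects. The column names are passed in explicitly because I have not checked how uniformly the "Data column names" field is formatted across READMEs.

```python
import csv
import io
import json


def first_line(base_prompt_text: str) -> str:
    """Return the first non-empty line of a base_prompt.txt as instructions."""
    for line in base_prompt_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def answer_options(train_tsv_text: str, label_column: str) -> list[str]:
    """Collect the distinct labels seen in train.tsv, sorted for stable output.

    Labels that never occur in the train split will be missing here.
    """
    reader = csv.DictReader(io.StringIO(train_tsv_text), delimiter="\t")
    return sorted({row[label_column] for row in reader})


def build_settings(task, base_prompt_text, train_tsv_text,
                   text_columns, label_column):
    """Assemble one settings entry. Key names other than field_ordering,
    label_keys and output_nouns are hypothetical."""
    options = answer_options(train_tsv_text, label_column)
    instructions = (
        first_line(base_prompt_text)
        + "\nAnswer with one of: " + ", ".join(options) + "."  # assumed wording
    )
    return {
        "task": task,                                  # hypothetical key
        "instructions": instructions,                  # hypothetical key
        "field_ordering": list(text_columns),
        "label_keys": [label_column],
        "output_nouns": [label_column.capitalize()],   # assumed convention
    }


def to_jsonl_line(settings: dict) -> str:
    """Serialize one entry as a single JSON Lines record."""
    return json.dumps(settings, ensure_ascii=False)


if __name__ == "__main__":
    # Tiny made-up inputs standing in for a task's real files.
    base = "Classify the clause.\n\nText: {{text}}\nLabel:"
    tsv = "index\ttext\tanswer\n0\tfoo\tYes\n1\tbar\tNo\n2\tbaz\tYes\n"
    print(to_jsonl_line(build_settings("demo_task", base, tsv, ["text"], "answer")))
```

Because the answer options come only from `train.tsv`, this approach is only safe if the question in step 2 can be answered with yes; otherwise the options would have to come from the README or the test split.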
What do you think? Happy to create a PR for that.