Congrats on release v1.0.0 last week!
I ran through the modelbench README.md. I cloned the repo, set everything up, and executed "Running Your First Benchmark" without issue. I then followed the step under "Using the Journal" and saw the results for prompt id airr_practice_1_0_41321, along with the scores for the various models. I think I understand what the sample prompt is and what it is trying to accomplish. Is there any further documentation on airr_practice_1_0_41321?
Now I am not sure what to do next. Are there other prompts or prompt suites I can test in a similar manner? I did check out the MLCommons website and the white paper, but I am not sure what else I can execute besides the practice prompt.
I would be willing to contribute a "next steps" section to the README.md if someone can guide me on my options for executing other prompts.