Update Eval Results for HumanEval #69

Open
ncoop57 opened this issue Oct 9, 2021 · 5 comments

Comments

@ncoop57
Collaborator

ncoop57 commented Oct 9, 2021

We need to update the HumanEval results because of a bug that was originally in our evaluation code and was fixed in PR #62.

@neubig
Collaborator

neubig commented Oct 9, 2021

I'm interested in this! Are there any directions on how to run the HumanEval evaluation? I looked through the "evaluation" directory briefly but it wasn't immediately obvious what to do.

@ncoop57
Collaborator Author

ncoop57 commented Oct 9, 2021

No, not yet. I'm actually creating it right now, as there are a few weird dependencies. I'll post here once I have the README up. It would be awesome to get some help from you, @neubig, on running it, as I'm seeing some weird behavior when running the completions!

@neubig
Collaborator

neubig commented Oct 9, 2021

Great, I'll try to take a look once there's a bit more doc.

@ncoop57
Collaborator Author

ncoop57 commented Oct 9, 2021

Just uploaded a README in the evaluation folder that has the steps. I haven't had the chance to test it, but it should hopefully get you to the point where you can start evaluating a model.

@reshinthadithyan
Collaborator

I can reproduce the set-up based on the readme.
