Notebook structure for evaluating LLMs. It currently runs 5 tests in Python coding and 5 tests in math, and will hopefully grow to cover translation, general question answering and basic knowledge, speed (tokens/sec), and much more!
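As a rough sketch of how one of these test loops might look (the prompts, expected answers, and the `generate` helper below are hypothetical placeholders, not the notebook's actual test set):

```python
# Minimal sketch of a math-test loop, assuming the model under test is
# wrapped in a generate(prompt) -> str helper. The two test cases are
# placeholders, not the notebook's real questions.
MATH_TESTS = [
    ("What is 17 * 24?", "408"),
    ("What is 2**10?", "1024"),
]

def run_math_eval(generate):
    """Return the fraction of math tests the model answers correctly."""
    correct = 0
    for prompt, expected in MATH_TESTS:
        answer = generate(prompt)
        if expected in answer:  # crude containment check, illustration only
            correct += 1
    return correct / len(MATH_TESTS)
```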
Currently, the LLMbenchmark notebook contains eval/benchmark code for four LLMs: Yi (6B), Vicuna (7B), Mistral (7B), and Gemma (2B).
You can recreate an eval for any LLM by copying the cells for one of the existing models and replacing the model-specific parts with your own model's name. The YourOwnLLMbenchmark Kaggle notebook serves as a template: follow the instructions in it to rebuild the eval for the LLM of your choice.
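As an illustration, if the models are loaded through Hugging Face `transformers` (an assumption here; see the notebook's first cell for its actual loading code), the model-specific part is typically just the checkpoint name:

```python
# Hypothetical sketch: MODEL_NAME is the only model-specific line to change,
# e.g. "mistralai/Mistral-7B-v0.1" or your own checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "your-org/your-model"  # <-- swap in the model you want to test

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for one benchmark prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```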
- LLMbenchmarkEvals.ipynb: eval/benchmark code for Yi (6B), Vicuna (7B), Mistral (7B), and Gemma (2B); returns a benchmark table for each model.
- LLMbenchmarkEvals-template.ipynb: copy the cells and replace the model-specific spots with the name of the LLM you want to evaluate.
- Kaggle
- Google Colab
Information and instructions on how to run each notebook are in its first cell.
- Celeste Deudon
- SMILE R&D
License: Apache 2.0