Meta Dataset of Biomedical Instructions (MedINST) is a novel multi-domain, multi-task instructional meta-dataset. MedINST comprises 133 biomedical NLP tasks and over 7 million training samples, making it the most comprehensive biomedical instruction dataset to date. Using MedINST as the meta dataset, we curate MedINST32, a challenging benchmark with different task difficulties aiming to evaluate LLMs' generalization ability.
Complete dataset MedINST can be accessed at [LiinXemmon/MedINST].
MedINST32 Benchmark Dataset can be checked at [LiinXemmon/MedINST32].
LLaMA3-MI: Further fine-tune LLaMA-3-8B-Instruct on 100K samples from MedINST.
MMedL3-MI: Further fine-tune MMed-Llama-3-8B on 100K samples from MedINST. [Google Drive]
We access and evaluate LLMs using OpenAI compatible APIs.
python evaluation.py --name <SAVE_NAME> --dir <SAVE_DIR> --model 'gpt-3.5-turbo' --key <YOUR_KEY>
Add --zero
option to evaluate the model in the zero-shot setting.
You may use vLLM to deploy a model at local.
Specify the URL of the deployed API by the option --base_url
. For example
python evaluation.py --name <SAVE_NAME> --dir <SAVE_DIR> --model <YOUR_MODEL> --key <YOUR_KEY> --base_url "http://localhost:8000/v1"