Load fine-tuning model #31
Comments
Hi, we still cannot load the fine-tuned model from the .pt file. We would appreciate it if you could give us some insight.
Thanks for your kind words and for using TabLLM! It seems like the code does not find the fine-tuned model. I hope that helps! Best
Hi, thanks for the information! Do you have some demo code for loading a fine-tuned model (T0_3B)? I have spent a few months trying to address this bug, but it still doesn't work. Thanks again!
You can load the T0-3B model as shown in the readme. To load a fine-tuned model, you should specify the path of the model where the model is loaded in the t-few code. I guess the best way is to create a config file that specifies the path as the model. However, we never did this manually and instead used the train and eval code of the t-few project, so unfortunately I do not have code for this. Hope that helps!
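For what it's worth, here is a minimal sketch of that approach inside get_transformer in pl_train.py, assuming the experiment directory contains the fine-tuned parameters as a plain PyTorch state dict (the filename finish.pt is a placeholder, not a name from this repository; use whatever .pt file your run produced, and if it is a PyTorch Lightning checkpoint the weights live under its "state_dict" key). modify_transformer is the existing t-few helper already called in this function:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


def get_transformer(config):
    # Load the base model and tokenizer, e.g. "bigscience/T0_3B".
    tokenizer = AutoTokenizer.from_pretrained(config.origin_model)
    model = AutoModelForSeq2SeqLM.from_pretrained(config.origin_model, low_cpu_mem_usage=True)
    tokenizer.model_max_length = config.max_seq_len

    # Insert the (IA)3 parameters first, so the fine-tuned checkpoint's
    # keys exist in the model before the weights are loaded.
    model = modify_transformer(model, config)

    # Merge the fine-tuned weights from the .pt file written by t-few.
    # strict=False tolerates base-model keys that are absent from a
    # checkpoint containing only the trained parameters.
    state_dict = torch.load(
        "exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k/finish.pt",
        map_location="cpu",
    )
    model.load_state_dict(state_dict, strict=False)
    return tokenizer, model
```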
Dear all,
First, thanks for your work; it is a real help for us!
I am now using your code to do some fine-tuning work, that is, first obtaining the best model (a .pt file and config.json) from my dataset.
But when I use the best (fine-tuned) model to run inference on a new dataset, my code in pl_train.py is:
```python
def get_transformer(config):
    tokenizer = AutoTokenizer.from_pretrained(config.origin_model)
    model = AutoModelForSeq2SeqLM.from_pretrained(config.origin_model, low_cpu_mem_usage=True)
    # Overwrite the base model and tokenizer with the fine-tuned ones
    # from the experiment directory:
    tokenizer = AutoTokenizer.from_pretrained("exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k")
    model = AutoModelForSeq2SeqLM.from_pretrained("exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k")
    tokenizer.model_max_length = config.max_seq_len
    model = modify_transformer(model, config)
    return tokenizer, model
```
The error output is:
```
Mark experiment t03b_heart_numshot0_seed42_ia3_pretrained100k as claimed
Traceback (most recent call last):
  File "/root/miniconda3/envs/tfew/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/root/miniconda3/envs/tfew/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/t-few/src/pl_train.py", line 118, in <module>
    main(config)
  File "/root/t-few/src/pl_train.py", line 65, in main
    tokenizer, model = get_transformer(config)
  File "/root/t-few/src/pl_train.py", line 33, in get_transformer
    tokenizer = AutoTokenizer.from_pretrained("exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k")
  File "/root/miniconda3/envs/tfew/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 489, in from_pretrained
    pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
  File "/root/miniconda3/envs/tfew/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 611, in from_pretrained
    f"Unrecognized model in {pretrained_model_name_or_path}. "
ValueError: Unrecognized model in exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k. Should have a model_type key in its config.json, or contain one of the following strings in its name: imagegpt, qdqbert, vision-encoder-decoder, trocr, fnet, segformer, vision-text-dual-encoder, perceiver, gptj, layoutlmv2, beit, rembert, visual_bert, canine, roformer, clip, bigbird_pegasus, deit, luke, detr, gpt_neo, big_bird, speech_to_text_2, speech_to_text, vit, wav2vec2, m2m_100, convbert, led, blenderbot-small, retribert, ibert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, megatron-bert, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta-v2, deberta, flaubert, fsmt, squeezebert, hubert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, speech-encoder-decoder, encoder-decoder, funnel, lxmert, dpr, layo
```
I would appreciate it if you could give me some insight.
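A note on the error itself: AutoTokenizer.from_pretrained expects the given directory to contain a Hugging Face config.json with a model_type key, which a t-few experiment directory does not provide. A possible workaround, sketched below under the assumption that fine-tuning started from bigscience/T0_3B, is to copy the base model's config and tokenizer files into the experiment directory:

```python
from transformers import AutoConfig, AutoTokenizer

exp_dir = "exp_out/t03b_heart_numshot2_seed42_ia3_pretrained100k"

# Write the base model's config.json (which contains model_type: "t5")
# and tokenizer files into the experiment directory so that the Auto*
# classes can recognize it.
AutoConfig.from_pretrained("bigscience/T0_3B").save_pretrained(exp_dir)
AutoTokenizer.from_pretrained("bigscience/T0_3B").save_pretrained(exp_dir)
```

Even then, AutoModelForSeq2SeqLM.from_pretrained(exp_dir) will only succeed if the directory also holds the weights under the filename transformers expects (pytorch_model.bin). Loading the base model and merging the .pt state dict, as sketched in the reply above, is the more direct route.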