Reminder
System Info
Hello, after fine-tuning internlm2.5-20B and merging the LoRA weights, an added_tokens.json file appears. Converting to a GGUF model succeeds, but loading it reports tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544, which looks like a vocabulary-size mismatch. What should I do?
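A small diagnostic sketch (my own assumption, not part of the original report) for confirming where the mismatch comes from: the tokenizer in the merged export directory counts the extra entries from added_tokens.json, while config.vocab_size still reflects the original embedding rows. The path below is the export_dir from the merge config further down.

```python
# Hypothetical diagnostic sketch: compare the tokenizer length (which includes
# the entries in added_tokens.json) against config.vocab_size (the embedding
# rows) in the merged export directory, before running convert_hf_to_gguf.py.
from transformers import AutoConfig, AutoTokenizer

merged_dir = "/home/denghui/share/model/wtgguf/intern"  # export_dir from the merge config

tokenizer = AutoTokenizer.from_pretrained(merged_dir, trust_remote_code=True)
config = AutoConfig.from_pretrained(merged_dir, trust_remote_code=True)

print("tokenizer length:", len(tokenizer))      # reportedly 92550
print("config.vocab_size:", config.vocab_size)  # reportedly 92544
```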
Reproduction
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat
### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
### dataset
dataset: identity,alpaca_en_demo
template: intern2
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: /home/denghui/share/model/wtgguf/internlm2.5
logging_steps: 5
save_steps: 1000
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 5
### add
flash_attn: auto
dataset_dir: data
dataset: our_nlp
max_grad_norm: 1.0
warmup_steps: 0
optim: adamw_torch
packing: False
report_to: none
include_num_input_tokens_seen: True
lora_rank: 8
lora_alpha: 16
lora_dropout: 0
The above is the SFT config file.
llamafactory-cli train train_lora/intern2.5-20_lora_sft.yaml
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat
adapter_name_or_path: /home/denghui/share/model/wtgguf/internlm2.5
template: intern2
finetuning_type: lora
### export
export_dir: /home/denghui/share/model/wtgguf/intern
export_size: 2
export_device: cpu
export_legacy_format: false
The above is the merge config file.
llamafactory-cli export merge_lora/intern_lora_sft.yaml
After merging, an added_tokens.json file appeared. The model converted with llama.cpp's convert_hf_to_gguf.py crashes at load time with: tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544. How should this be handled?
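One possible workaround, sketched below under the assumption that the added tokens should be kept: resize the merged model's embeddings to match the enlarged tokenizer and re-save before converting to GGUF again. The `fixed_dir` path is made up for illustration; this is not a confirmed LLaMA-Factory fix.

```python
# Hypothetical workaround sketch: grow the embedding / lm_head rows of the merged
# model to len(tokenizer) so the checkpoint and the GGUF converter agree, then
# rerun llama.cpp's convert_hf_to_gguf.py on fixed_dir.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_dir = "/home/denghui/share/model/wtgguf/intern"       # export_dir from the merge config
fixed_dir = "/home/denghui/share/model/wtgguf/intern_fixed"  # hypothetical output path

tokenizer = AutoTokenizer.from_pretrained(merged_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    merged_dir, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Pads the vocabulary dimension up to len(tokenizer); the new rows are freshly
# initialized, which only matters if those token ids are actually generated.
model.resize_token_embeddings(len(tokenizer))

model.save_pretrained(fixed_dir)
tokenizer.save_pretrained(fixed_dir)
```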
Expected behavior
Is there a parameter that keeps the vocabulary size unchanged?
Others
Or is there some way to regenerate the vocabulary?
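If the tokens listed in added_tokens.json are not actually needed at inference time, another option is to put the base model's original tokenizer back into the merged directory so the converter sees the unchanged 92544-token vocabulary. This is only a sketch of that idea; whether it is safe depends on whether the fine-tuned chat template relies on those added tokens.

```python
# Hypothetical sketch: restore the base tokenizer over the merged export dir so
# its vocabulary matches the 92544-row embedding again, then reconvert to GGUF.
import os
from transformers import AutoTokenizer

base_dir = "/home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat"
merged_dir = "/home/denghui/share/model/wtgguf/intern"

stale = os.path.join(merged_dir, "added_tokens.json")
if os.path.exists(stale):
    os.remove(stale)  # drop the extra tokens recorded during fine-tuning

tokenizer = AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True)
tokenizer.save_pretrained(merged_dir)

print("tokenizer length after restore:", len(tokenizer))  # expected to be 92544 again
```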