What should I do when the vocabulary length is inconsistent after fine-tuning? #5436

Closed
1 task done
topology1 opened this issue Sep 14, 2024 · 0 comments
Labels
wontfix This will not be worked on

Comments

@topology1

Reminder

  • I have read the README and searched the existing issues.

System Info

Hello. After fine-tuning and merging intern2.5-20B, an added_tokens.json file appeared. Converting to a GGUF model succeeds, but loading it reports tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544. It looks like the vocabulary lengths are inconsistent. What should I do?
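To confirm the mismatch before converting, a minimal check (a sketch using the standard transformers API; the path is the export directory from the merge step below):

```python
from transformers import AutoConfig, AutoTokenizer

# Export directory from the merge step in this report.
export_dir = "/home/denghui/share/model/wtgguf/intern"

tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
config = AutoConfig.from_pretrained(export_dir, trust_remote_code=True)

# If added_tokens.json grew the tokenizer beyond config.vocab_size, the GGUF
# loader expects more embedding rows than the weights contain
# (92550 vs 92544 here, i.e. 6 extra tokens).
print("len(tokenizer)    =", len(tokenizer))
print("config.vocab_size =", config.vocab_size)
```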

Reproduction

```yaml
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: identity,alpaca_en_demo
template: intern2
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: /home/denghui/share/model/wtgguf/internlm2.5
logging_steps: 5
save_steps: 1000
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 5

### add
flash_attn: auto
dataset_dir: data
dataset: our_nlp
max_grad_norm: 1.0
warmup_steps: 0
optim: adamw_torch
packing: False
report_to: none
include_num_input_tokens_seen: True
lora_rank: 8
lora_alpha: 16
lora_dropout: 0
```
The above is the SFT parameter file, run with:

```
llamafactory-cli train train_lora/intern2.5-20_lora_sft.yaml
```

```yaml
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat
adapter_name_or_path: /home/denghui/share/model/wtgguf/internlm2.5
template: intern2
finetuning_type: lora

### export
export_dir: /home/denghui/share/model/wtgguf/intern
export_size: 2
export_device: cpu
export_legacy_format: false
```
The above is the merge file, run with:

```
llamafactory-cli export merge_lora/intern_lora_sft.yaml
```

After merging, an added_tokens.json appeared. I then converted the merged model with llama.cpp's convert_hf_to_gguf.py, and the converted model crashes on load with tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544. How should I handle this?
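One possible workaround, only a sketch and not a fix confirmed by the maintainers: resize the merged checkpoint's embedding matrix to match the enlarged tokenizer before re-running convert_hf_to_gguf.py. `resize_token_embeddings` is the standard transformers API for this; note the six new rows are freshly initialized, not trained.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Export directory from the merge step above.
export_dir = "/home/denghui/share/model/wtgguf/intern"

tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(export_dir, trust_remote_code=True)

# Grow the input embeddings (and tied lm_head) from 92544 rows to
# len(tokenizer) rows (92550 here) so the shapes agree at conversion time.
model.resize_token_embeddings(len(tokenizer))

model.save_pretrained(export_dir)
tokenizer.save_pretrained(export_dir)
# Then re-run llama.cpp's convert_hf_to_gguf.py on export_dir.
```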

Expected behavior

Is there a parameter that keeps the vocabulary length unchanged?

Others

Alternatively, is there some way to regenerate the vocabulary?
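One hypothetical way to "regenerate the vocabulary" (an assumption, not verified against this model): if the entries in added_tokens.json duplicate tokens that already exist in the base vocabulary, overwriting the exported tokenizer files with the base model's tokenizer would drop the six extra ids and match the merged weights:

```python
import os
from transformers import AutoTokenizer

base_dir = "/home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat"
export_dir = "/home/denghui/share/model/wtgguf/intern"

# Remove the stale file first; save_pretrained only rewrites files it emits.
stale = os.path.join(export_dir, "added_tokens.json")
if os.path.exists(stale):
    os.remove(stale)

# Assumption: the added tokens already exist in the base vocabulary, so the
# base tokenizer (92544 entries) is consistent with the merged weights.
AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True).save_pretrained(export_dir)
```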

@github-actions github-actions bot added the pending This problem is yet to be addressed label Sep 14, 2024
@hiyouga hiyouga added wontfix This will not be worked on and removed pending This problem is yet to be addressed labels Nov 2, 2024
@hiyouga hiyouga closed this as not planned Nov 2, 2024