What should I do when the vocabulary length is inconsistent after fine-tuning? #5436

Closed
1 task done
topology1 opened this issue Sep 14, 2024 · 0 comments
Labels
wontfix This will not be worked on

Comments

@topology1

Reminder

  • I have read the README and searched the existing issues.

System Info

Hello. After fine-tuning and merging intern2.5-20B, an added_tokens.json file appeared. Converting to a GGUF model succeeds, but loading it reports tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544. It looks like the vocabulary lengths are inconsistent. What should I do?
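To confirm the mismatch before converting, a minimal check (a sketch using the standard transformers API; the path is the export directory from the merge step below):

```python
from transformers import AutoConfig, AutoTokenizer

# Export directory from the merge step in this report.
export_dir = "/home/denghui/share/model/wtgguf/intern"

tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
config = AutoConfig.from_pretrained(export_dir, trust_remote_code=True)

# If added_tokens.json grew the tokenizer beyond config.vocab_size, the GGUF
# loader expects more embedding rows than the weights contain
# (92550 vs 92544 here, i.e. 6 extra tokens).
print("len(tokenizer)    =", len(tokenizer))
print("config.vocab_size =", config.vocab_size)
```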

Reproduction

```yaml
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: identity,alpaca_en_demo
template: intern2
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: /home/denghui/share/model/wtgguf/internlm2.5
logging_steps: 5
save_steps: 1000
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000

### eval
val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 5

### add
flash_attn: auto
dataset_dir: data
dataset: our_nlp
max_grad_norm: 1.0
warmup_steps: 0
optim: adamw_torch
packing: False
report_to: none
include_num_input_tokens_seen: True
lora_rank: 8
lora_alpha: 16
lora_dropout: 0
```
The above is the SFT parameter file, run with:

```
llamafactory-cli train train_lora/intern2.5-20_lora_sft.yaml
```

```yaml
### model
model_name_or_path: /home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat
adapter_name_or_path: /home/denghui/share/model/wtgguf/internlm2.5
template: intern2
finetuning_type: lora

### export
export_dir: /home/denghui/share/model/wtgguf/intern
export_size: 2
export_device: cpu
export_legacy_format: false
```
The above is the merge file, run with:

```
llamafactory-cli export merge_lora/intern_lora_sft.yaml
```

After merging, an added_tokens.json appeared. I then converted the merged model with llama.cpp's convert_hf_to_gguf.py, and the converted model crashes on load with tensor 'token_embd.weight' has wrong shape; expected 6144, 92550, got 6144, 92544. How should I handle this?
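One possible workaround, only a sketch and not a fix confirmed by the maintainers: resize the merged checkpoint's embedding matrix to match the enlarged tokenizer before re-running convert_hf_to_gguf.py. `resize_token_embeddings` is the standard transformers API for this; note the six new rows are freshly initialized, not trained.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Export directory from the merge step above.
export_dir = "/home/denghui/share/model/wtgguf/intern"

tokenizer = AutoTokenizer.from_pretrained(export_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(export_dir, trust_remote_code=True)

# Grow the input embeddings (and tied lm_head) from 92544 rows to
# len(tokenizer) rows (92550 here) so the shapes agree at conversion time.
model.resize_token_embeddings(len(tokenizer))

model.save_pretrained(export_dir)
tokenizer.save_pretrained(export_dir)
# Then re-run llama.cpp's convert_hf_to_gguf.py on export_dir.
```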

Expected behavior

Is there a parameter that keeps the vocabulary length unchanged?

Others

Alternatively, is there some way to regenerate the vocabulary?
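One hypothetical way to "regenerate the vocabulary" (an assumption, not verified against this model): if the entries in added_tokens.json duplicate tokens that already exist in the base vocabulary, overwriting the exported tokenizer files with the base model's tokenizer would drop the six extra ids and match the merged weights:

```python
import os
from transformers import AutoTokenizer

base_dir = "/home/denghui/share/model/org/Shanghai_AI_Laboratory/internlm2_5-20b-chat"
export_dir = "/home/denghui/share/model/wtgguf/intern"

# Remove the stale file first; save_pretrained only rewrites files it emits.
stale = os.path.join(export_dir, "added_tokens.json")
if os.path.exists(stale):
    os.remove(stale)

# Assumption: the added tokens already exist in the base vocabulary, so the
# base tokenizer (92544 entries) is consistent with the merged weights.
AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True).save_pretrained(export_dir)
```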

@github-actions github-actions bot added the pending This problem is yet to be addressed label Sep 14, 2024
@hiyouga hiyouga added wontfix This will not be worked on and removed pending This problem is yet to be addressed labels Nov 2, 2024
@hiyouga hiyouga closed this as not planned Nov 2, 2024