Transformers does not recognize this architecture #25

Open
DeadLining opened this issue Dec 27, 2024 · 0 comments
DeadLining commented Dec 27, 2024

I am using Transformers 4.47.1, but ms-swift raises the following error:

ValueError: The checkpoint you are trying to load has model type deepseek_vl2 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

ms-swift:
CUDA_VISIBLE_DEVICES=0,1,2,3 \
VIDEO_MAX_PIXELS=50176 \
swift sft \
    --model /gpu/nfs/raymodel/deepseek-ai/deepseek-vl2-small \
    --train_type lora \
    --dataset accident/train.jsonl \
    --num_train_epochs 10 \
    --per_device_train_batch_size 2 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --gradient_accumulation_steps 16 \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 10 \
    --target_modules all-linear \
    --freeze_llm false \
    --freeze_vit false \
    --freeze_aligner false
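
For anyone trying to reproduce this: the error text suggests that the `model_type` value in the checkpoint's `config.json` is not among the model types the installed Transformers knows about. The sketch below is one way to check that, not a fix. The helper names `read_model_type` and `is_registered` are hypothetical. It assumes the checkpoint directory contains a `config.json` and that `CONFIG_MAPPING` can be imported from `transformers`.

```python
import json
from pathlib import Path


def read_model_type(checkpoint_dir):
    """Return the `model_type` declared in the checkpoint's config.json, or None."""
    config_path = Path(checkpoint_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return json.load(f).get("model_type")


def is_registered(model_type, registry):
    """True if `model_type` is present in `registry` (any container supporting `in`)."""
    return model_type is not None and model_type in registry


if __name__ == "__main__":
    # Path copied from the command in this issue; adjust it to your own checkpoint.
    checkpoint = Path("/gpu/nfs/raymodel/deepseek-ai/deepseek-vl2-small")
    try:
        # Assumption: CONFIG_MAPPING is the registry of model types known to Transformers.
        from transformers import CONFIG_MAPPING
    except ImportError:
        CONFIG_MAPPING = None

    if CONFIG_MAPPING is None:
        print("transformers is not installed in this environment")
    elif not (checkpoint / "config.json").is_file():
        print(f"no config.json found under {checkpoint}")
    else:
        model_type = read_model_type(checkpoint)
        print(f"model_type={model_type!r}, "
              f"registered={is_registered(model_type, CONFIG_MAPPING)}")
```

If this reports `registered=False` for `deepseek_vl2`, the installed Transformers has no built-in entry for that model type, which would be consistent with the `ValueError` above.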

DeadLining changed the title from "Transformers does" to "Transformers does not recognize this architecture" on Dec 27, 2024