add NormalizedConfig support for qwen, baichuan, chatglm #1490
Conversation
Signed-off-by: changwangss <[email protected]>
"mixformer-sequential": GPTBigCodeNormalizedTextConfig, | ||
"baichuan": NormalizedTextConfig, | ||
"qwen": NormalizedTextConfig, | ||
"chatglm": NormalizedTextConfig.with_args(num_layers="num_layers"), |
What about the vocab size? Shouldn't it be padded_vocab_size for ChatGLM models?
https://huggingface.co/THUDM/chatglm3-6b/blob/main/config.json#L32
ChatGLM has three model generations: I find that the original chatglm uses vocab_size, but chatglm2 and chatglm3 use padded_vocab_size. Could you help me deal with this situation? @echarlaix
https://huggingface.co/THUDM/chatglm-6b/blob/main/config.json#L27
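For reference, a minimal sketch of one way to handle the split, assuming the with_args remapping API already used by the chatglm entry in the diff above; the chatglm_normalized_config helper and the hasattr branching are illustrative, not part of this PR:

```python
from optimum.utils import NormalizedTextConfig

# Illustrative helper (not part of this PR): pick the vocabulary-size
# attribute per checkpoint, since the original chatglm config exposes
# vocab_size while chatglm2/chatglm3 expose padded_vocab_size.
def chatglm_normalized_config(model_config):
    vocab_attr = (
        "padded_vocab_size"
        if hasattr(model_config, "padded_vocab_size")
        else "vocab_size"
    )
    # with_args pre-binds the attribute-name remapping, mirroring the
    # chatglm entry in the diff above; calling the result wraps the config.
    return NormalizedTextConfig.with_args(
        num_layers="num_layers",
        vocab_size=vocab_attr,
    )(model_config)
```

The alternative would be a dedicated ChatGLM-specific NormalizedTextConfig subclass, but a per-checkpoint remap keeps the manager's mapping table flat.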
Phi and Mixtral have been added to transformers, so Phi and Mixtral are being added separately with #1625 first.
@@ -262,6 +262,11 @@ class NormalizedConfigManager:
"whisper": WhisperLikeNormalizedTextConfig,
"xlm-roberta": NormalizedTextConfig,
"yolos": NormalizedVisionConfig,
"mpt": MPTNormalizedTextConfig,
mpt is already in the list.
"baichuan": NormalizedTextConfig, | ||
"qwen": NormalizedTextConfig, | ||
"chatglm": NormalizedTextConfig.with_args(num_layers="num_layers"), |
qwen2 is now available in transformers. baichuan and chatglm are not.
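For context, a hedged usage sketch of how a registered entry is consumed downstream, assuming optimum's NormalizedConfigManager.get_normalized_config_class accessor and that the qwen remote config exposes the standard num_hidden_layers and num_attention_heads names (the checkpoint name is illustrative):

```python
from transformers import AutoConfig
from optimum.utils import NormalizedConfigManager

# Usage sketch: once a model type is registered in the manager's table,
# downstream code reads architecture attributes through one interface.
# Qwen checkpoints ship their config as remote code, hence the flag.
config = AutoConfig.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
normalized_cls = NormalizedConfigManager.get_normalized_config_class(config.model_type)
normalized = normalized_cls(config)
print(normalized.num_layers, normalized.num_attention_heads)
```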
What does this PR do?
Fixes # (issue)
Before submitting