YaRN RoPE settings only appear in DeepSeek and Phi-3 code, any reason? #10282
bartowski1182
started this conversation in
General
Replies: 1 comment 2 replies
-
Sorry, I don't know the logic behind adding RoPE scaling settings for some models, or why it isn't done more generally for all of them. @Galunid or @compilade may have the answer.
-
I was looking through convert_hf_to_gguf.py, particularly at the YaRN RoPE scaling settings, and I can only find references to them in the DeepSeek and Phi-3 code. I know Qwen2.5 uses YaRN to extend context (though not by default in their main uploads), so I was wondering: if one added the YaRN settings to config.json and extended the context to 128k, this would not get saved into the GGUF metadata, right? Is there any reason for that?
Could it be added more universally?
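To make the question concrete, here is a minimal sketch of what a model-agnostic mapping might look like. It is not the converter's actual code: `yarn_metadata` is a hypothetical helper, the `qwen2` architecture string and the config values are illustrative, and the key names are assumed to follow the `{arch}.rope.scaling.*` pattern used in GGUF metadata.

```python
import json


def yarn_metadata(config: dict, arch: str) -> dict:
    """Hypothetical helper: map a config.json rope_scaling block to GGUF-style keys.

    Returns an empty dict when the config does not request YaRN scaling.
    """
    rope_scaling = config.get("rope_scaling") or {}
    # Assumption: accept both "rope_type" and "type" as the field naming the method.
    scaling_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if scaling_type != "yarn" or "factor" not in rope_scaling:
        return {}

    # Assumed key names, following the {arch}.rope.scaling.* pattern.
    meta = {
        f"{arch}.rope.scaling.type": "yarn",
        f"{arch}.rope.scaling.factor": float(rope_scaling["factor"]),
    }
    orig_ctx = rope_scaling.get("original_max_position_embeddings")
    if orig_ctx is not None:
        meta[f"{arch}.rope.scaling.original_context_length"] = int(orig_ctx)
    return meta


# Illustrative config.json fragment: 32768 * 4.0 = 131072 tokens (128k).
config = json.loads("""
{
  "max_position_embeddings": 131072,
  "rope_scaling": {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
""")

for key, value in yarn_metadata(config, "qwen2").items():
    print(f"{key} = {value}")
```

The point of the sketch is that nothing in the mapping depends on the model family, which is why it seems like it could live in a shared code path rather than in per-model classes.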
Tagging @slaren (sorry for the ping, but you're the most likely to have an answer)