Replies: 1 comment
The default head size is …
---
Usually, the attention head size is

`head_dim = hidden_dim // num_attention_heads`

in many model architectures, including Llama. Some models use more flexible `head_dim` sizes. For Llama models, here is one pending PR for HF.
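As a concrete illustration of the default rule (using the stock Llama-2-7B configuration, `hidden_dim = 4096` and `num_attention_heads = 32`, as assumed example values), here is a minimal sketch:

```cpp
#include <cstdio>

int main() {
    // Example values from the stock Llama-2-7B config (assumed for illustration).
    const int hidden_dim          = 4096; // model embedding width
    const int num_attention_heads = 32;   // number of attention heads

    // The common default: split the embedding width evenly across heads.
    const int head_dim = hidden_dim / num_attention_heads;

    printf("head_dim = %d\n", head_dim); // prints 128
    return 0;
}
```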
Looking at `src/llama.cpp`, I feel like the information is handled around here, but I'm not sure:

`llama.cpp/src/llama.cpp`, lines 4698 to 4702 at commit `1b6ff90`
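For context, loaders in this style commonly compute the conventional default first and then let an optional metadata key override it. The sketch below is a self-contained illustration of that default-then-override pattern, not the actual llama.cpp code; the names `hparams_t`, `n_embd_head`, `attention.key_length`, and `get_key_or` are simplified stand-ins:

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Simplified stand-in for the model's hyperparameter block (hypothetical names).
struct hparams_t {
    uint32_t n_embd      = 4096; // embedding width
    uint32_t n_head      = 32;   // number of attention heads
    uint32_t n_embd_head = 0;    // per-head size, filled in below
};

// Stand-in for GGUF-style metadata: key -> value.
using metadata_t = std::map<std::string, uint32_t>;

// Return the value for `key` if present, otherwise keep `fallback`
// (mirrors an optional-override metadata lookup).
static uint32_t get_key_or(const metadata_t & meta, const std::string & key, uint32_t fallback) {
    auto it = meta.find(key);
    return it != meta.end() ? it->second : fallback;
}

int main() {
    hparams_t hparams;

    // A model whose metadata overrides the default head size.
    metadata_t meta = { { "attention.key_length", 256 } };

    // 1) Compute the conventional default.
    const uint32_t def = hparams.n_embd / hparams.n_head; // 4096 / 32 = 128

    // 2) Let an explicit metadata key override it when present.
    hparams.n_embd_head = get_key_or(meta, "attention.key_length", def);

    printf("n_embd_head = %u\n", hparams.n_embd_head); // prints 256
    return 0;
}
```

With this arrangement, later graph-building code can read the per-head size from the hyperparameter block instead of recomputing `n_embd / n_head`, which is what makes non-default head sizes possible.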
Could anybody help me understand how the information is loaded into `hparams` and how it can be used in `build_*()`? Thank you!