  File "Ms-PoE/utils/modify_arch/llama.py", line 330, in forward
    self.head_order = self._head_wise_statistics(query_states, key_states, q_len, kv_seq_len, bsz, attention_mask)
  File "Ms-PoE/utils/modify_arch/llama.py", line 176, in _head_wise_statistics
    raise ValueError(
ValueError: Attention weights should be of size (1, 32, 1793, 3586), but is torch.Size([1, 32, 1793, 1793])
The error is raised by the following shape check in llama.py: the key-length dimension of attn_weights does not match the kv_seq_len that was passed in.
if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len):
    raise ValueError(
        f"Attention weights should be of size {(bsz, self.num_heads, q_len, kv_seq_len)}, but is"
        f" {attn_weights.size()}"
    )
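One possible workaround, sketched below in pure Python with tensor shapes modeled as tuples (this is a hypothetical illustration, not the repository's actual fix, and the helper name check_attn_weights is invented): instead of validating against the kv_seq_len handed down from forward(), derive the expected key length from the key tensor that _head_wise_statistics actually receives. In the traceback above, kv_seq_len is 3586 while only 1793 keys reached the check, so validating against the key tensor itself would pass.

```python
def check_attn_weights(attn_weights_size, query_size, key_size):
    """Validate attention-weight shape against the tensors that produced it.

    attn_weights_size: (bsz, num_heads, q_len, k_len)
    query_size:        (bsz, num_heads, q_len, head_dim)
    key_size:          (bsz, num_heads, k_len, head_dim)
    """
    bsz, num_heads, q_len, _ = query_size
    # Length of the keys actually attended over, taken from the key
    # tensor itself rather than from an externally supplied kv_seq_len.
    k_len = key_size[2]
    expected = (bsz, num_heads, q_len, k_len)
    if attn_weights_size != expected:
        raise ValueError(
            f"Attention weights should be of size {expected}, "
            f"but is {attn_weights_size}"
        )
    return expected

# The failing case from the traceback: kv_seq_len (3586) counted keys
# the statistics function never saw, but the check passes when the
# expected size is derived from the 1793 keys actually present.
check_attn_weights((1, 32, 1793, 1793), (1, 32, 1793, 128), (1, 32, 1793, 128))
```

Whether kv_seq_len or the local key length is the right reference depends on how Ms-PoE handles the KV cache, so treat this only as a way to localize the mismatch.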