RuntimeError: The size of tensor unmatched #4018
I am seeing this error as well. It happens with multiple models and multiple model loaders (e.g. llama.cpp, ExLlama). I've seen it with Nous-Hermes-13B-SuperHOT-8K-GPTQ, brittlewis12_Kunoichi-DPO-v2-7B-GGUF, TheBloke_Echidna-13B-v0.2-GGUF, Blue-Orchid-2x7b_GGUF, and others. I tried increasing the context length (with the appropriate rope scaling values), but no luck. It turned out that in my case the error was silero_tts choking on long sequences, i.e. #3653.
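If silero_tts is the culprit, one workaround until a fix lands is to split long replies into sentence-sized chunks before synthesis. Here is a minimal sketch; `chunk_for_tts` and the `MAX_TTS_CHARS` limit are illustrative assumptions, not part of the extension's actual API:

```python
import re

MAX_TTS_CHARS = 800  # illustrative limit, not a documented silero constraint

def chunk_for_tts(text, max_chars=MAX_TTS_CHARS):
    """Split text on sentence boundaries so no chunk exceeds max_chars."""
    sentences = re.split(r'(?<=[.!?])\s+', text)
    chunks, current = [], ''
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f'{current} {sentence}'.strip()
    if current:
        chunks.append(current)
    return chunks

# Hypothetical usage: synthesize each chunk separately and concatenate the
# audio, instead of passing the whole reply to the model in one call.
# for chunk in chunk_for_tts(reply):
#     audio_parts.append(model.apply_tts(text=chunk, speaker=speaker,
#                                        sample_rate=48000))
```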
Hello. Thanks to the WebUI, I've been enjoying working with text-generation AI.
I have a question about an issue I encountered while using japanese-stablelm-instruct-alpha-7b (the 4-bit model).
I'm experiencing a runtime error that doesn't occur immediately on startup, but tends to appear after a certain amount of conversation.
Once it occurs, I can no longer continue the conversation.
Deleting the conversation log seems to resolve the error, which suggests something goes wrong when the conversation log is compressed.
I've tried adjusting parameters to resolve the error, but I couldn't find any that improve the situation.
Do you know of any way to resolve this error?
I'll provide the startup command and traceback below.
Execution command:
!python server.py --share --settings /content/drive/MyDrive/settings.yaml --wbits 4 --groupsize 128 --trust-remote-code --model_type gpt_neox --model /content/drive/MyDrive/text-generation-webui/models/japanese-stablelm-instruct-alpha-7b-s
Traceback:
Traceback (most recent call last):
File "/content/drive/MyDrive/text-generation-webui/modules/callbacks.py", line 56, in gentask
ret = self.mfunc(callback=_callback, *args, **self.kwargs)
File "/content/drive/MyDrive/text-generation-webui/modules/text_generation.py", line 323, in generate_with_callback
shared.model.generate(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1642, in generate
return self.sample(
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2724, in sample
outputs = self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 601, in forward
outputs = self.transformer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 203, in forward
outputs = layer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 260, in forward
attention_layer_outputs = self.attention(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 467, in forward
query, key = apply_rotary_pos_emb(
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 383, in apply_rotary_pos_emb
q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 3
Output generated in 6.34 seconds (0.16 tokens/s, 1 tokens, context 1024, seed 1481878986)
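For context, the failing line applies rotary position embeddings (RoPE), and the error means the query tensor's last dimension (32, the rotary dimensions) no longer matches the cached cos/sin tables (64). Below is a minimal sketch that reproduces the broadcast failure; all shapes are illustrative assumptions, not the model's actual configuration:

```python
import torch

def rotate_half(x):
    # Split the last dimension in two and swap the halves with a sign flip,
    # as in standard rotary position embedding (RoPE) implementations.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

# Hypothetical shapes: RoPE is applied to only part of the head dimension
# (rotary_ndims = 32), but the cos/sin tables cover the full head
# dimension (64), so the element-wise multiply cannot broadcast.
batch, heads, seq, rotary_ndims, head_dim = 1, 8, 10, 32, 64

q = torch.randn(batch, heads, seq, rotary_ndims)  # tensor a: size 32 at dim 3
cos = torch.randn(1, 1, seq, head_dim)            # tensor b: size 64 at dim 3
sin = torch.randn(1, 1, seq, head_dim)

# Raises: RuntimeError: The size of tensor a (32) must match the size of
# tensor b (64) at non-singleton dimension 3
q_embed = (q * cos) + (rotate_half(q) * sin)
```

A mismatch like this would mean the sin/cos cache was built or sliced for a different number of rotary dimensions than the query/key tensors, which would be consistent with the error appearing only once the conversation grows past a certain length.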