RuntimeError: The size of tensor unmatched #4018
I am seeing this error as well. It happens with multiple models and multiple model loaders (e.g. llama.cpp, ExLlama). I've seen it with Nous-Hermes-13B-SuperHOT-8K-GPTQ, brittlewis12_Kunoichi-DPO-v2-7B-GGUF, TheBloke_Echidna-13B-v0.2-GGUF, Blue-Orchid-2x7b_GGUF, and others. I tried increasing the context length (with the appropriate rope scaling values), but no luck. It turned out that in my case the error was silero_tts choking on long sequences, i.e. #3653.
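If silero_tts is the culprit, one workaround until a fix lands is to split long replies into sentence-sized chunks before synthesis. Here is a minimal sketch; `chunk_for_tts` and the `MAX_TTS_CHARS` limit are illustrative assumptions, not part of the extension's actual API:

```python
import re

MAX_TTS_CHARS = 800  # illustrative limit, not a documented silero constraint

def chunk_for_tts(text, max_chars=MAX_TTS_CHARS):
    """Split text on sentence boundaries so no chunk exceeds max_chars."""
    sentences = re.split(r'(?<=[.!?])\s+', text)
    chunks, current = [], ''
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f'{current} {sentence}'.strip()
    if current:
        chunks.append(current)
    return chunks

# Hypothetical usage: synthesize each chunk separately and concatenate the
# audio, instead of passing the whole reply to the model in one call.
# for chunk in chunk_for_tts(reply):
#     audio_parts.append(model.apply_tts(text=chunk, speaker=speaker,
#                                        sample_rate=48000))
```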
Hello. Thanks to the WebUI, I've been enjoying working with text-generation AI.
I have a question about an issue I encountered while using japanese-stablelm-instruct-alpha-7b (the 4-bit model).
I'm experiencing a runtime error that doesn't occur immediately on startup, but tends to appear after a certain amount of conversation.
Once it occurs, I can no longer continue the conversation.
Deleting the conversation log seems to resolve the error, which suggests something goes wrong when the conversation log is compressed.
I've tried adjusting parameters to resolve the error, but I couldn't find any that improve the situation.
Do you know of any way to resolve this error?
I'll provide the startup command and traceback below.
Execution command:
!python server.py --share --settings /content/drive/MyDrive/settings.yaml --wbits 4 --groupsize 128 --trust-remote-code --model_type gpt_neox --model /content/drive/MyDrive/text-generation-webui/models/japanese-stablelm-instruct-alpha-7b-s
Traceback:
Traceback (most recent call last):
File "/content/drive/MyDrive/text-generation-webui/modules/callbacks.py", line 56, in gentask
ret = self.mfunc(callback=_callback, *args, **self.kwargs)
File "/content/drive/MyDrive/text-generation-webui/modules/text_generation.py", line 323, in generate_with_callback
shared.model.generate(**kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1642, in generate
return self.sample(
File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2724, in sample
outputs = self(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 601, in forward
outputs = self.transformer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 203, in forward
outputs = layer(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 260, in forward
attention_layer_outputs = self.attention(
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 467, in forward
query, key = apply_rotary_pos_emb(
File "/root/.cache/huggingface/modules/transformers_modules/japanese-stablelm-instruct-alpha-7b-s/modeling_japanese_stablelm_alpha.py", line 383, in apply_rotary_pos_emb
q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 3
Output generated in 6.34 seconds (0.16 tokens/s, 1 tokens, context 1024, seed 1481878986)
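For context, the failing line applies rotary position embeddings (RoPE), and the error means the query tensor's last dimension (32, the rotary dimensions) no longer matches the cached cos/sin tables (64). Below is a minimal sketch that reproduces the broadcast failure; all shapes are illustrative assumptions, not the model's actual configuration:

```python
import torch

def rotate_half(x):
    # Split the last dimension in two and swap the halves with a sign flip,
    # as in standard rotary position embedding (RoPE) implementations.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

# Hypothetical shapes: RoPE is applied to only part of the head dimension
# (rotary_ndims = 32), but the cos/sin tables cover the full head
# dimension (64), so the element-wise multiply cannot broadcast.
batch, heads, seq, rotary_ndims, head_dim = 1, 8, 10, 32, 64

q = torch.randn(batch, heads, seq, rotary_ndims)  # tensor a: size 32 at dim 3
cos = torch.randn(1, 1, seq, head_dim)            # tensor b: size 64 at dim 3
sin = torch.randn(1, 1, seq, head_dim)

# Raises: RuntimeError: The size of tensor a (32) must match the size of
# tensor b (64) at non-singleton dimension 3
q_embed = (q * cos) + (rotate_half(q) * sin)
```

A mismatch like this would mean the sin/cos cache was built or sliced for a different number of rotary dimensions than the query/key tensors, which would be consistent with the error appearing only once the conversation grows past a certain length.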