RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1, 128] #9

Open
YoungSeng opened this issue Dec 16, 2024 · 2 comments

Comments

@YoungSeng

I'm hitting a RuntimeError when running a script that uses the DeepSeek-VL2-tiny model for image and language processing. The error is raised during the forward pass, inside the call to vl_gpt.language.generate.

Error Message:

Traceback (most recent call last):
  File "/home/yiqiao/Desktop/DeepSeek-VL2/mydemo.py", line 57, in <module>
    outputs = vl_gpt.language.generate(
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/transformers/generation/utils.py", line 2252, in generate
    result = self._sample(
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/transformers/generation/utils.py", line 3251, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yiqiao/Desktop/DeepSeek-VL2/deepseek_vl/models/modeling_deepseek.py", line 1723, in forward
    outputs = self.model(
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yiqiao/Desktop/DeepSeek-VL2/deepseek_vl/models/modeling_deepseek.py", line 1592, in forward
    layer_outputs = decoder_layer(
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yiqiao/Desktop/DeepSeek-VL2/deepseek_vl/models/modeling_deepseek.py", line 1306, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yiqiao/miniconda3/envs/vita/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 309, in forward
    query_states = query_states.view(bsz, q_len, -1, self.head_dim).transpose(1, 2)
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1, 128] because the unspecified dimension size -1 can be any value and is ambiguous
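
For readers unfamiliar with this message: the target shape [1, 0, -1, 128] has q_len equal to 0, so the known dimensions multiply to 0 and any value for -1 would satisfy a 0-element tensor. The sketch below illustrates that rule in plain Python. It is an illustration only, not PyTorch's actual implementation; `infer_view_shape` is a hypothetical helper, and the hidden size of 1280 in the usage lines is an assumed number chosen so that the arithmetic works out.

```python
from math import prod


def infer_view_shape(numel, shape):
    """Resolve a single -1 in `shape` for a tensor of `numel` elements.

    Hypothetical helper: mimics the rule behind the error above,
    it is not the code PyTorch runs.
    """
    unknown = [i for i, d in enumerate(shape) if d == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be inferred")

    # Product of every dimension that was given explicitly.
    known = prod(d for d in shape if d != -1)

    if not unknown:
        if known != numel:
            raise ValueError(f"shape {list(shape)} is invalid for {numel} elements")
        return list(shape)

    if known == 0:
        # 0 * anything == 0, so with 0 elements every value of -1 fits.
        raise ValueError(
            f"cannot reshape {numel} elements into {list(shape)}: -1 is ambiguous"
        )

    if numel % known != 0:
        raise ValueError(f"shape {list(shape)} is invalid for {numel} elements")

    resolved = list(shape)
    resolved[unknown[0]] = numel // known
    return resolved


if __name__ == "__main__":
    # Non-empty sequence (q_len = 5, assumed hidden size 1280): -1 resolves.
    print(infer_view_shape(1 * 5 * 1280, [1, 5, -1, 128]))

    # Empty sequence (q_len = 0), as in the traceback: -1 cannot be resolved.
    try:
        infer_view_shape(0, [1, 0, -1, 128])
    except ValueError as err:
        print(err)
```

In other words, the reshape itself is not the root problem; the attention layer received a query tensor with a sequence length of 0, and whatever produced that empty input is where to look.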

Environment:

torch                             2.5.1+cu121
torchaudio                        2.5.1+cu121
torchvision                       0.20.1+cu121
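
The list above does not include the transformers version, although the traceback passes through transformers/generation/utils.py and transformers/models/llama/modeling_llama.py. A minimal sketch for collecting the relevant versions in one go, using only the standard library; `installed_versions` is a hypothetical helper name, and the package list is an assumption about what is worth reporting here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed set of packages relevant to this traceback.
    names = ["torch", "torchvision", "torchaudio", "transformers"]
    for name, ver in installed_versions(names).items():
        print(f"{name:<12} {ver or 'not installed'}")
```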

@robinren03

see #4 (comment)

@NB-wo

NB-wo commented Dec 25, 2024

> see #4 (comment)

I've tried that, but I still get the same error.
