
DeepSeek-VL2 doesn't support older NVIDIA devices like the P40 and GTX 1080. What can be done to make this project support these older devices? #5

Open
qmzpg opened this issue Dec 15, 2024 · 1 comment

Comments

qmzpg commented Dec 15, 2024

Add pad token = ['<|▁pad▁|>'] to the tokenizer
<|▁pad▁|>:2
Add image token = ['<image>'] to the tokenizer
<image>:128815
Add grounding-related tokens = ['<|ref|>', '<|/ref|>', '<|det|>', '<|/det|>', '<|grounding|>'] to the tokenizer with input_ids
<|ref|>:128816
<|/ref|>:128817
<|det|>:128818
<|/det|>:128819
<|grounding|>:128820
Add chat tokens = ['<|User|>', '<|Assistant|>'] to the tokenizer with input_ids
<|User|>:128821
<|Assistant|>:128822

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:58<00:00, 7.29s/it]
You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
Traceback (most recent call last):
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/test001.py", line 54, in
inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/deepseek_vl/models/modeling_deepseek_vl_v2.py", line 325, in prepare_inputs_embeds
images_feature = self.vision(total_tiles)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/deepseek_vl/models/siglip_vit.py", line 548, in forward
x = self.forward_features(x)
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/deepseek_vl/models/siglip_vit.py", line 529, in forward_features
x = self.blocks(x)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/container.py", line 219, in forward
input = module(input)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/deepseek_vl/models/siglip_vit.py", line 231, in forward
x = x + self.drop_path1(self.ls1(self.attn(self.norm1(x))))
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/baochaoqian/dnn/test/llm/DeepSeek-VL2-main/deepseek_vl/models/siglip_vit.py", line 143, in forward
x = memory_efficient_attention(q, k, v, p=self.attn_drop.p if self.training else 0.)
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/init.py", line 276, in memory_efficient_attention
return _memory_efficient_attention(
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/init.py", line 403, in _memory_efficient_attention
return _fMHA.apply(
File "/home/baochaoqian/fixfolder001/Anaconda/python310/lib/python3.10/site-packages/torch/autograd/function.py", line 574, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/init.py", line 74, in forward
out, op_ctx = _memory_efficient_attention_forward_requires_grad(
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/init.py", line 428, in _memory_efficient_attention_forward_requires_grad
op = _dispatch_fw(inp, True)
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 119, in _dispatch_fw
return _run_priority_list(
File "/home/baochaoqian/.local/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 55, in _run_priority_list
raise NotImplementedError(msg)
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(7, 729, 16, 72) (torch.bfloat16)
key : shape=(7, 729, 16, 72) (torch.bfloat16)
value : shape=(7, 729, 16, 72) (torch.bfloat16)
attn_bias : <class 'NoneType'>
p : 0.0
[email protected] is not supported because:
requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old)
bf16 is only supported on A100+ GPUs
cutlassF-pt is not supported because:
bf16 is only supported on A100+ GPUs
smallkF is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
dtype=torch.bfloat16 (supported: {torch.float32})
bf16 is only supported on A100+ GPUs
unsupported embed per head: 72
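The dispatcher fails because none of the xformers kernels can run bf16 attention on a compute-capability 6.1 (Pascal) GPU. A minimal sketch of a possible workaround, assuming the q/k/v passed in `siglip_vit.py` use the (batch, seq_len, heads, head_dim) layout that xformers expects: bypass xformers entirely and call PyTorch's built-in `scaled_dot_product_attention` (torch >= 2.0), which can fall back to a plain math kernel on older GPUs. The helper name below is hypothetical.

```python
import torch.nn.functional as F

def sdpa_fallback(q, k, v, p=0.0):
    # Hypothetical drop-in replacement for xformers.ops.memory_efficient_attention.
    # q, k, v arrive as (batch, seq_len, num_heads, head_dim); PyTorch's SDPA
    # expects (batch, num_heads, seq_len, head_dim), so transpose in and out.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=p)
    return out.transpose(1, 2)

# In deepseek_vl/models/siglip_vit.py (around line 143), one could swap
#   x = memory_efficient_attention(q, k, v, p=...)
# for
#   x = sdpa_fallback(q, k, v, p=...)
```

This only removes the xformers dispatch failure; the bf16 dtype itself is still slow or unsupported on Pascal, so it likely needs to be combined with an fp16 or fp32 cast of the model weights.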

@ApolloRay

bf16 — try on an A100 or A800.
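Since bf16 needs Ampere-class hardware, an alternative on a P40/1080 is to cast the weights to float16 (or float32 if fp16 overflows) when loading. A rough sketch, assuming the `AutoModelForCausalLM` loading path from the repo README; the model path below is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM

model_path = "deepseek-ai/deepseek-vl2"  # placeholder: use the checkpoint you actually downloaded

vl_gpt = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
# Pascal GPUs (compute capability 6.1) have no native bf16 support,
# so cast to fp16 instead of the bf16 cast shown in the README.
vl_gpt = vl_gpt.to(torch.float16).cuda().eval()
```

With fp16 inputs the xformers cutlass kernel may be able to dispatch on pre-Ampere GPUs; if it still refuses, combining this cast with the SDPA fallback sketched above is another option.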
