[Bug] Error when fine-tuning InternLM on Ascend 910 #212

Open
rourouZ opened this issue Apr 23, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@rourouZ

rourouZ commented Apr 23, 2024

Describe the bug

Traceback (most recent call last):
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/pool.py", line 131, in worker
put((job, i, result))
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/queues.py", line 368, in put
self._writer.send_bytes(obj)
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
BrokenPipeError: [Errno 32] Broken pipe
File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)

During handling of the above exception, another exception occurred:

File "/root/miniconda3/envs/internLM/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
Traceback (most recent call last):
BrokenPipeError: [Errno 32] Broken pipe

Environment

python==3.8
torch==2.0.1

Other information

No response

@rourouZ rourouZ added the bug Something isn't working label Apr 23, 2024
@gaoyang07 gaoyang07 assigned SolenoidWGT and unassigned yhcc Apr 24, 2024
@SolenoidWGT
Contributor

Hello @rourouZ, the error stack printed by torch_npu does not seem to contain much useful information. The environment we use for the Huawei NPU adaptation is:

        torch: 2.1.0+cpu
        torch_npu: 2.1.0.post3+git7c4136d
        cann: 8.0.RC1.alpha003

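You could try running with this environment; in my testing it should work. If you have any questions, you can also @ me in the InternLM discussion group.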
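
To compare a local setup against the versions listed above, a minimal sketch along these lines may help. It only reads installed package metadata and does not import torch or touch the NPU. The helper names `installed_version` and `check_environment` are hypothetical, and the distribution name `torch_npu` is an assumption about how the package was installed. CANN is not a pip package, so it is not checked here.

```python
from importlib import metadata

# Version prefixes taken from the environment reported as working in this thread.
# The package names are assumptions about how the packages were installed.
EXPECTED = {"torch": "2.1.0", "torch_npu": "2.1.0.post3"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version, matches_expected_prefix)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        # Prefix match so local suffixes such as "+cpu" or "+git..." still count.
        matches = found is not None and found.startswith(wanted)
        report[package] = (found, matches)
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "matches" if ok else "differs"
        print(f"{package}: {found or 'not installed'} ({status})")
```

A mismatch here, such as the torch==2.0.1 reported in the issue, does not by itself explain the BrokenPipeError, but it is a quick first thing to rule out.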

@weiliangxiong

Could you please provide the NPU image that runs successfully? Many thanks!

@li126com
Collaborator

> Could you please provide the NPU image that runs successfully? Many thanks!

You can try this one: docker pull internlm/opencompass:opencompass-20240607
