build requirement fails on linux #2282
Comments
It's caused by this: boostorg/boost#502 |
There's a PR being assembled that fixes this. In the meantime, you can get a working source tree here: https://github.com/jimmystewpot/ethminer |
I created a smaller pull request (#2288) that fixes this by manually specifying the new boost URL. A temporary fix could be to download the boost package to
|
@jimmystewpot any chance this will help with the binary not finding the A100s, as described in my response at the original project? (Sorry if it is way off topic.) |
@hlfritz what's the issue number? I have access to A100s, so I can test. |
@jimmystewpot mentioned in 2309, 2307. CUDA Error: system not yet initialized, even though nvidia-smi sees all the GPUs. If I can provide more details, let me know. Ubuntu 18.04, CUDA 11.2, 8 A100s. I get the same error with the stable branch or after I try recompiling per 2309. |
To add: on my test A100 with 20.04 it just works. Can you confirm the following: https://www.supermicro.com/support/faqs/faq.cfm?faq=31029 If that is done, then try to write out an strace to a file and attach it somehow. It could be a missing file or permission. |
Hmmm. I did not have DCGM installed. It is now installed, enabled, and running. deviceQuery still fails as described in that link (which is for CUDA 10; it seems things have changed, and there does not seem to be an nvidia-fabricmanager any longer?). Neither nv-hostengine nor the nvidia-fabricmanager service seems to exist after installing DCGM. I also still get the same CUDA Error: system not initialized. Not sure how to write out an strace file? Willing to do so if you can point me in the right direction. Thx! EDIT: Dug around on the system a bit. It seems that the DCGM service is really just 'nv-hostengine -n'. It is running, but the article says to terminate it. Even though DCGM is installed, there is no nvidia-fabricmanager on the system. |
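On the strace question above, here is a minimal sketch in Python, assuming strace is installed and that the failing binary is started as `./ethminer --list-devices` (that invocation and the log file name `ethminer.strace` are placeholders, not something confirmed in this thread). It only relies on the standard strace options `-f` (follow child processes and threads) and `-o` (write the trace to a file).

```python
import shlex
import shutil
import subprocess


def build_strace_command(target_cmd, log_path="ethminer.strace"):
    """Return the argv list that runs target_cmd under strace.

    -f follows forked children and threads; -o writes the trace to
    log_path instead of stderr. The default log name is a placeholder.
    """
    return ["strace", "-f", "-o", log_path] + list(target_cmd)


def run_with_strace(target_cmd, log_path="ethminer.strace"):
    """Run target_cmd under strace and return its exit code."""
    if shutil.which("strace") is None:
        raise RuntimeError("strace is not installed or not on PATH")
    completed = subprocess.run(
        build_strace_command(target_cmd, log_path), check=False
    )
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical invocation: replace with the command that actually fails.
    cmd = build_strace_command(["./ethminer", "--list-devices"])
    print(shlex.join(cmd))
```

The printed command can also be pasted straight into a shell; the resulting log file is what would be attached to the issue.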
@jimmystewpot James, thank you very much for the hint. I figured out how to install fabric manager and everything works now! MUCH appreciated. While those specific instructions do not work for newer versions, NVIDIA has good instructions for getting this done with the distribution's package manager. It seems this may be an HGX and DGX A100 issue (although DGX comes with fabricmanager already installed for you). Follow this to install/enable the CUDA repos: https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html#ubuntu-lts After installing the CUDA drivers, follow sections 2.6 and 2.7 in this guide: https://docs.nvidia.com/datacenter/tesla/pdf/fabric-manager-user-guide.pdf Helmut |
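For anyone hitting the same error, a quick way to check whether the fabric manager is running before launching the miner could look like the sketch below. It assumes a systemd-based system and that the service unit is named `nvidia-fabricmanager`, as it is referred to in this thread; the helper names are made up for illustration.

```python
import subprocess


def is_active_output(stdout):
    """Interpret the text printed by `systemctl is-active <unit>`."""
    return stdout.strip() == "active"


def fabric_manager_active(unit="nvidia-fabricmanager"):
    """Return True if systemd reports the given unit as active.

    The unit name is an assumption taken from the discussion above.
    Returns False if systemctl is not available on this machine.
    """
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return False
    return is_active_output(result.stdout)


if __name__ == "__main__":
    if fabric_manager_active():
        print("fabric manager is active")
    else:
        print("fabric manager is not active (or systemctl is unavailable)")
```

If this reports the service as not active on an HGX A100 system, the installation steps in the guide linked above are the place to start.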
error downloading: