Inconsistent CUDA toolkit path: /usr vs /usr/lib #295

Open
jimlloyd opened this issue Jul 25, 2022 · 2 comments

@jimlloyd

I believe this problem is probably the fault of the TensorFlow configure scripts rather than anything specific to tensorflow_cc, but I am hoping someone might have information on how to work around it.

The problem is that after running cd tensorflow_cc && mkdir build && cd build && cmake .. && make, the make step fails with this error:

Inconsistent CUDA toolkit path: /usr vs /usr/lib
Asking for detailed CUDA configuration...

I have been trying to install on a fresh Ubuntu 20.04 or 22.04 system. I have tried several methods of installing CUDA and cuDNN, and every method has resulted in this error.

By the way, the first method I tried was Lambda Stack on 22.04. It would be awesome if tensorflow_cc were compatible with Lambda Stack. When I first hit this "Inconsistent CUDA toolkit path" problem I assumed Lambda Stack had installed CUDA and cuDNN at non-standard paths, so I switched to more conventional installation methods. I have since learned that I run into the same problem without Lambda Stack, so I am hopeful that once I figure out the fix I will be able to use Lambda Stack after all.

My most recent attempt was with 20.04. I installed:

  1. NVIDIA drivers using the GUI "Additional Drivers" utility.
  2. CUDA 10.1 using sudo apt install nvidia-cuda-toolkit.
  3. cuDNN by downloading cudnn-10.1-linux-x64-v8.0.5.39 from NVIDIA's website and following the instructions to untar it and copy the components into /usr/local/cuda/... (roughly as sketched below the list).
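
For reference, step 3 amounted to roughly the following. This is a sketch of NVIDIA's standard tar-file instructions, not an exact transcript, and the archive filename depends on the cuDNN version downloaded:

# Unpack the cuDNN archive downloaded from NVIDIA (filename may differ).
tar -xzvf cudnn-10.1-linux-x64-v8.0.5.39.tgz

# Copy the headers and libraries into the CUDA toolkit tree and make them readable.
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*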

FYI, I have of course spent time searching for information about this exact error, "Inconsistent CUDA toolkit path:". I know it is an exception raised from tensorflow/third_party/gpus/find_cuda_config.py. The problematic code is preceded by this comment:

# XLA requires the toolkit path to find ptxas and libdevice.
# TODO(csigg): pass in both directories instead.

I have tried various hacks with the code, including simply commenting out the code that raises the exception, which allows the build to proceed but eventually results in a similar exception being raised, presumably when building XLA.
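
Another hack I have been considering, but have not verified, is forcing find_cuda_config.py to search a single consistent base path. If I understand the script correctly, it honors a TF_CUDA_PATHS environment variable (a comma-separated list of base directories to search), so something along these lines might sidestep the inconsistency, assuming tensorflow_cc forwards the environment to TensorFlow's configure step:

# Untested sketch: point TensorFlow's CUDA detection at one base directory.
# Assumes the toolkit, ptxas, libdevice, and cuDNN all live under /usr/local/cuda.
export TF_CUDA_PATHS=/usr/local/cuda
export TF_CUDA_VERSION=10.1
export TF_CUDNN_VERSION=8
cd tensorflow_cc/build && cmake .. && make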

Does anyone know how to work around this problem?

@jimlloyd
Author

Within a few minutes of writing this I did more searching and found this issue:

tensorflow/tensorflow#40202

There is a comment: "A more reliable workaround is to install the cuda toolkit using Nvidia's .run file installer."

I'm going to try that.
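
If it works, the plan is roughly the following; the exact .run filename is a guess for CUDA 10.1 and needs to be checked against NVIDIA's download page:

# Sketch only: install just the CUDA toolkit from NVIDIA's runfile installer,
# keeping the driver that was installed via "Additional Drivers".
sudo sh cuda_10.1.243_418.87.00_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-10.1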

@FloopCZ
Owner

FloopCZ commented Jul 26, 2022

Yes, that may be the way to go, or you could take a look at the official NVIDIA CUDA Docker image source on which we run the CI: https://gitlab.com/nvidia/container-images/cuda/blob/master/dist/11.7.0/ubuntu2204/runtime/Dockerfile
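
For example, something along these lines should get you an environment close to the CI one. The tag name is from memory and may have moved, the devel variant (rather than the runtime one linked above) is what you want for building, and nvidia-container-toolkit must be installed on the host for --gpus to work:

# Sketch: build inside NVIDIA's prebuilt CUDA + cuDNN image instead of
# installing the toolkit on the host.
docker run --gpus all -it nvidia/cuda:11.7.0-cudnn8-devel-ubuntu22.04 bash

# Inside the container, install the usual build prerequisites and then
# follow the tensorflow_cc README build steps as before.
apt-get update && apt-get install -y git cmake build-essential
git clone https://github.com/FloopCZ/tensorflow_cc.git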
