Lack of site-packages breaks assumptions of third-party packages causing friction #2156
Comments
Hi experts, I'm new to Bazel and having an issue with nvimgcodec. I tried adding it to the preloading function, but preloading doesn't seem to work for me.

# Imports needed by this excerpt
import ctypes
import os
import platform
import sys
from typing import Dict

def _preload_cuda_deps(lib_folder: str, lib_name: str) -> None:
    """Preloads cuda deps if they could not be found otherwise."""
    # Should only be called on Linux if default path resolution have failed
    assert platform.system() == 'Linux', 'Should only be called on Linux'
    import glob
    lib_path = None
    for path in sys.path:
        nvidia_path = os.path.join(path, 'nvidia')
        if not os.path.exists(nvidia_path):
            continue
        print(f"Checking nvidia_path {nvidia_path}")
        # nvimgcodec keeps its shared library directly in its folder, not under lib/
        if "nvimgcodec" == lib_folder:
            candidate_lib_paths = glob.glob(os.path.join(nvidia_path, lib_folder, 'libnvimgcodec.so.*[0-9]'))
        else:
            candidate_lib_paths = glob.glob(os.path.join(nvidia_path, lib_folder, 'lib', lib_name))
        print(f"Found candidate_lib_paths {candidate_lib_paths}")
        if candidate_lib_paths and not lib_path:
            lib_path = candidate_lib_paths[0]
            print(f"Found lib_path {lib_path}")
        if lib_path:
            break
    print(f"Preloading {lib_name} from {lib_path}")
    if not lib_path:
        raise ValueError(f"{lib_name} not found in the system path {sys.path}")
    ctypes.CDLL(lib_path)

def preload_cuda_deps() -> None:
    cuda_libs: Dict[str, str] = {
        'cublas': 'libcublas.so.*[0-9]',
        'cudnn': 'libcudnn.so.*[0-9]',
        'cuda_nvrtc': 'libnvrtc.so.*[0-9].*[0-9]',
        'cuda_runtime': 'libcudart.so.*[0-9].*[0-9]',
        'cuda_cupti': 'libcupti.so.*[0-9].*[0-9]',
        'cufft': 'libcufft.so.*[0-9]',
        'curand': 'libcurand.so.*[0-9]',
        'cusolver': 'libcusolver.so.*[0-9]',
        'cusparse': 'libcusparse.so.*[0-9]',
        'nccl': 'libnccl.so.*[0-9]',
        'nvtx': 'libnvToolsExt.so.*[0-9]',
        'nvimgcodec': 'libnvimgcodec.so.*[0-9]',
    }
    for lib_folder, lib_name in cuda_libs.items():
        _preload_cuda_deps(lib_folder, lib_name)

I have several Nvidia libraries and I wanted to use 'from nvidia import nvimgcodec', but multiple libraries have their own 'nvidia' directory under site-packages (e.g., pip_deps_cublas_cu11/site-packages/nvidia/ and pip_deps_nvimagecodec_cu11/site-packages/nvidia/), and 'from nvidia' always directs me to the cublas library.

My workaround is to copy the nvimgcodec library from my local Python environment to the Bazel directory, place it under pip_deps_nvimagecodec_cu11/site-packages/nvidia_img/, and then use 'from nvidia_img import nvimgcodec'. I also tried just copying the nvimgcodec library from pip_deps_nvimagecodec_cu11/site-packages/nvidia and modifying the linking, but that didn't work, so I copied it from my local environment instead. I'm not sure if I can add this as a patch, because it doesn't really make sense.

Do you know if there's a better solution for this? Thanks so much for your help!
By the way, with this workaround I can use 'from nvidia_img import nvimgcodec', but it seems the library is not initialized correctly. When I run the sample code to get a decoder, I just get None. I'm not sure if it's related to the copying and re-linking.

from nvidia import nvimgcodec
decoder = nvimgcodec.Decoder()
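The shadowing described above, where several site-packages roots each carry their own copy of the same top-level package and the import always resolves to the first one, can be reproduced in isolation. The following is a minimal sketch, not a rules_python feature: `vendorpkg_demo`, `cublas_like`, `codec_like` and the helper names are hypothetical stand-ins, and it assumes the shared top-level package is a regular package (it has an `__init__.py`), which would explain why only the first match on `sys.path` is searched. The helper `extend_package_path` appends every matching directory to the package's `__path__` so that submodules from later roots become importable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "vendorpkg_demo"  # hypothetical stand-in for the shared top-level package


def make_root(base: Path, name: str, submodule: str) -> Path:
    """Create <base>/<name>/site-packages/<PKG>/<submodule>.py (one root per wheel)."""
    site_dir = base / name / "site-packages"
    pkg_dir = site_dir / PKG
    pkg_dir.mkdir(parents=True)
    # An __init__.py makes this a regular package, so the first sys.path hit wins.
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / f"{submodule}.py").write_text(f"NAME = {submodule!r}\n")
    return site_dir


def extend_package_path(package_name: str) -> None:
    """Append every sys.path entry that contains the package to its __path__."""
    package = importlib.import_module(package_name)
    for entry in sys.path:
        candidate = Path(entry) / package_name
        if candidate.is_dir() and str(candidate) not in package.__path__:
            package.__path__.append(str(candidate))


def demo() -> tuple:
    """Return (was_shadowed, NAME of the submodule found after extending __path__)."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = [
            make_root(Path(tmp), "wheel_a", "cublas_like"),
            make_root(Path(tmp), "wheel_b", "codec_like"),
        ]
        sys.path[:0] = [str(root) for root in roots]
        try:
            importlib.invalidate_caches()
            try:
                # Only wheel_a's copy of the package is searched at this point.
                importlib.import_module(f"{PKG}.codec_like")
                shadowed = False
            except ModuleNotFoundError:
                shadowed = True
            extend_package_path(PKG)
            codec = importlib.import_module(f"{PKG}.codec_like")
            return shadowed, codec.NAME
        finally:
            # Undo the sys.path and sys.modules changes made by the demo.
            for root in roots:
                sys.path.remove(str(root))
            for mod in [m for m in sys.modules if m == PKG or m.startswith(PKG + ".")]:
                del sys.modules[mod]
```

Calling `demo()` first fails to import the submodule from the second root, then succeeds once `__path__` has been extended. This only addresses Python import resolution; it would not by itself help shared libraries that look for their siblings through relative paths.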
Could someone comment on which version of rules_python this is not broken in? PyTorch did work before, at the very minimum. It'd be great to know if there's a rollback path.
rules_python has always worked this way, so yes, it's not a regression.
Context
This is a tracking issue to recognise that the lack of a site-packages layout causes friction when making use of third-party distribution packages (wheels and sdists) from indexes such as PyPI.

Outside bazel and rules_python, it is common for distribution packages to assume that they will be installed into a site-packages layout, either in a "virtual environment" or directly into a python user or global site installation.

Notable examples are the libraries in the AI / ML ecosystem that make use of the nvidia CUDA shared libraries. These shared libraries contain relative rpath entries in the ELF/Mach-O/DLL, which fail to resolve when the libraries are not installed as siblings in a site-packages layout.

Another rare issue is failure to load *.pth files. Python provides Site-specific configuration hooks that can customize the sys.path at startup. rules_python could perhaps work around this issue, but if a site-packages layout was used and discovered by the interpreter at startup, no workarounds would be necessary.

Distribution packages on PyPI known to have issues:
Known workarounds
Related
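As an illustration of the *.pth mechanism mentioned under Context, the sketch below shows how the standard library's `site.addsitedir` processes `*.pth` files in a directory and adds the paths they list to `sys.path`. It is a minimal demonstration under stated assumptions, not something rules_python is known to do: the directory names, the `demo_extra.pth` file name and the `demo_pth` function are all hypothetical.

```python
import site
import sys
import tempfile
from pathlib import Path


def demo_pth() -> bool:
    """Return True if a path listed in a .pth file ends up on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        site_dir = Path(tmp) / "site-packages"   # hypothetical site directory
        extra_dir = Path(tmp) / "extra_lib"      # directory the .pth file points at
        site_dir.mkdir()
        extra_dir.mkdir()
        # Each non-comment line naming an existing directory is added to sys.path.
        (site_dir / "demo_extra.pth").write_text(f"{extra_dir}\n")

        before = list(sys.path)
        try:
            # addsitedir appends site_dir itself and processes its *.pth files.
            site.addsitedir(str(site_dir))
            return str(extra_dir) in sys.path
        finally:
            sys.path[:] = before  # restore sys.path after the demonstration
```

The interpreter performs this processing automatically only for directories it recognises as site directories at startup, which is why a layout it does not discover leaves `*.pth` files unprocessed unless something calls `site.addsitedir` explicitly.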