v2.1.2+cu118 and v2.1.1+cu118 run into torchdata ImportError: libssl.so.3: cannot open shared object file: No such file or directory, that v2.1.0+cu118 doesn't have an issue with
#1220 · Open · justinxzhao opened this issue on Jan 11, 2024 · 1 comment
We are noticing a strange error specifically when using torch 2.1.1+cu118 and torch 2.1.2+cu118 that does not occur with torch 2.1.0+cu118.
The error looks like this:
Traceback (most recent call last):
from ludwig.api import LudwigModel
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/api.py", line 41, in <module>
from ludwig.backend import Backend, initialize_backend, provision_preprocessing_workers
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/backend/__init__.py", line 22, in <module>
from ludwig.backend.base import Backend, LocalBackend
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/backend/base.py", line 34, in <module>
from ludwig.data.cache.manager import CacheManager
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/cache/manager.py", line 8, in <module>
from ludwig.data.dataset.base import DatasetManager
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/data/dataset/base.py", line 24, in <module>
from ludwig.distributed import DistributedStrategy
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/distributed/__init__.py", line 3, in <module>
from ludwig.distributed.base import DistributedStrategy, LocalStrategy
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/distributed/base.py", line 11, in <module>
from ludwig.modules.optimization_modules import create_optimizer
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/modules/optimization_modules.py", line 21, in <module>
from ludwig.utils.torch_utils import LudwigModule
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/utils/torch_utils.py", line 14, in <module>
from ludwig.utils.strings_utils import SpecialSymbol
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/utils/strings_utils.py", line 33, in <module>
from ludwig.utils.tokenizers import get_tokenizer_from_registry
File "/home/ray/anaconda3/lib/python3.8/site-packages/ludwig/utils/tokenizers.py", line 21, in <module>
import torchtext
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchtext/__init__.py", line 12, in <module>
from . import data, datasets, prototype, functional, models, nn, transforms, utils, vocab, experimental
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchtext/datasets/__init__.py", line 3, in <module>
from .ag_news import AG_NEWS
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchtext/datasets/ag_news.py", line 5, in <module>
from torchdata.datapipes.iter import FileOpener, IterableWrapper
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchdata/__init__.py", line 7, in <module>
from torchdata import _extension # noqa: F401
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchdata/_extension.py", line 34, in <module>
_init_extension()
File "/home/ray/anaconda3/lib/python3.8/site-packages/torchdata/_extension.py", line 31, in _init_extension
from torchdata import _torchdata as _torchdata
ImportError: libssl.so.3: cannot open shared object file: No such file or directory
The complaint appears to come from torchdata, which gets installed together with urllib3>2.0.
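Since the ImportError is raised while loading torchdata's compiled `_torchdata` extension, it may help to check whether the dynamic linker can find `libssl.so.3` at all, independently of torchdata. Below is a minimal sketch using only the standard library; the helper name `can_load_shared_library` is made up for illustration, and `libssl.so.1.1` is included only as the soname of the older OpenSSL 1.1 series for comparison.

```python
import ctypes
import ctypes.util


def can_load_shared_library(soname):
    """Return True if the dynamic linker can load `soname`, else False."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library cannot be found or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libssl.so.3 is the name from the traceback; libssl.so.1.1 is the
    # OpenSSL 1.1 soname, listed here only for comparison (an assumption
    # about what the image might ship instead).
    for soname in ("libssl.so.3", "libssl.so.1.1"):
        print(soname, "loadable:", can_load_shared_library(soname))
    # Shows what the system resolves the generic name "ssl" to, if anything.
    print("find_library('ssl'):", ctypes.util.find_library("ssl"))
```

If `libssl.so.3` is not loadable but `libssl.so.1.1` is, that would be consistent with the environment shipping an older OpenSSL than the one the installed torchdata binary was linked against.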
When we pin urllib3==1.26.16 to try to mitigate the libssl.so error, we get a different error:
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1382, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/ray/anaconda3/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/ray/anaconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 28, in <module>
from ..integrations.deepspeed import is_deepspeed_zero3_enabled
File "/home/ray/anaconda3/lib/python3.8/site-packages/transformers/integrations/deepspeed.py", line 49, in <module>
from accelerate.utils.deepspeed import HfDeepSpeedConfig as DeepSpeedConfig
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/__init__.py", line 3, in <module>
from .accelerator import Accelerator
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/accelerator.py", line 35, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/utils/__init__.py", line 153, in <module>
from .launch import (
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/utils/launch.py", line 24, in <module>
from ..commands.config.config_args import SageMakerConfig
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/commands/config/__init__.py", line 19, in <module>
from .config import config_command_parser
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/commands/config/config.py", line 25, in <module>
from .sagemaker import get_sagemaker_input
File "/home/ray/anaconda3/lib/python3.8/site-packages/accelerate/commands/config/sagemaker.py", line 35, in <module>
import boto3 # noqa: F401
File "/home/ray/anaconda3/lib/python3.8/site-packages/boto3/__init__.py", line 17, in <module>
from boto3.session import Session
File "/home/ray/anaconda3/lib/python3.8/site-packages/boto3/session.py", line 17, in <module>
import botocore.session
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/session.py", line 26, in <module>
import botocore.client
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/client.py", line 15, in <module>
from botocore import waiter, xform_name
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/waiter.py", line 18, in <module>
from botocore.docs.docstring import WaiterDocstring
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/docs/__init__.py", line 15, in <module>
from botocore.docs.service import ServiceDocumenter
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/docs/service.py", line 14, in <module>
from botocore.docs.client import ClientDocumenter, ClientExceptionsDocumenter
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/docs/client.py", line 14, in <module>
from botocore.docs.example import ResponseExampleDocumenter
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/docs/example.py", line 13, in <module>
from botocore.docs.shape import ShapeDocumenter
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/docs/shape.py", line 19, in <module>
from botocore.utils import is_json_value_header
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/utils.py", line 34, in <module>
import botocore.httpsession
File "/home/ray/anaconda3/lib/python3.8/site-packages/botocore/httpsession.py", line 21, in <module>
from urllib3.util.ssl_ import (
ImportError: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_' (/home/ray/anaconda3/lib/python3.8/site-packages/urllib3/util/ssl_.py)
This suggests a different incompatibility (perhaps from deepspeed?).
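The second traceback fails inside botocore on `from urllib3.util.ssl_ import DEFAULT_CIPHERS`. urllib3 2.x no longer exports that name, so it may be worth confirming which urllib3 is actually being imported at runtime, in case the pin did not take effect. The following is a rough sketch; the helper `has_attribute` is hypothetical and only wraps standard-library calls.

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True/False if the module imports; None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


if __name__ == "__main__":
    # True is expected on urllib3 1.26.x, False on urllib3 2.x,
    # and None if urllib3 is not importable at all.
    print("DEFAULT_CIPHERS available:",
          has_attribute("urllib3.util.ssl_", "DEFAULT_CIPHERS"))
```

A `False` result here would point at a urllib3 2.x install being picked up despite the pin, rather than at deepspeed itself.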
Anyway, it seems that torch 2.1.0+cu118 either doesn't require the newest version of torchdata or works with urllib3==1.26.16 (or both), which appears to mitigate our issues.
However, the errors with 2.1.1+cu118 and 2.1.2+cu118 seemed odd to me, so I'm raising it here in case anyone has any helpful tidbits!
Versions
2.1.0+cu118 (works)
2.1.1+cu118 (broken)
2.1.2+cu118 (broken)
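For anyone trying to reproduce this, a small sketch that collects the installed versions of the packages appearing in the tracebacks may make the working and broken environments easier to compare. The helper name `installed_versions` and the package list are illustrative choices, not something the report prescribes.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages taken from the two tracebacks above.
    names = ["torch", "torchdata", "torchtext", "urllib3", "botocore", "boto3"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version}")
```

Running this in both a 2.1.0+cu118 and a 2.1.2+cu118 environment and comparing the output should show whether torchdata or urllib3 changed between the two.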