Issue on page /_modules/ray/tune/integration/pytorch_lightning.html #33708

Closed

actuallyaswin opened this issue Mar 25, 2023 · 4 comments

@actuallyaswin

I hit an issue when using TuneReportCallback in the context of PyTorch Lightning.

2023-03-24 22:49:09,709 INFO worker.py:1553 -- Started a local Ray instance.                                                                       
== Status ==                                                                                                                                       
Current time: 2023-03-24 22:49:23 (running for 00:00:12.52)
Memory usage on this node: 38.1/251.8 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/32 CPUs, 0.1/2 GPUs, 0.0/109.27 GiB heap, 0.0/50.82 GiB objects (0.0/1.0 accelerator_type:A40)
Result logdir: ...
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+------------------------+
| Trial name             | status   | loc                    |
|------------------------+----------+------------------------|
| experiment_9efa3_00000 | RUNNING  | 129.79.247.102:3416524 |
+------------------------+----------+------------------------+


(experiment pid=3416524) Global seed set to 0
2023-03-24 22:49:24,560 ERROR trial_runner.py:1062 -- Trial experiment_9efa3_00000: Error processing event.
ray.exceptions.RayTaskError(ValueError): ray::ImplicitFunc.train() (pid=3416524, ip=129.79.247.102, repr=experiment)
  File "/home/asivara/venv/lib/python3.8/site-packages/ray/tune/trainable/trainable.py", line 368, in train
    raise skipped from exception_cause(skipped)
  File "/home/asivara/venv/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 337, in entrypoint
    return self._trainable_func(
  File "/home/asivara/venv/lib/python3.8/site-packages/ray/tune/trainable/function_trainable.py", line 654, in _trainable_func
    output = fn()
  File "train.py", line 280, in experiment
    trainer = Trainer(
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/utilities/argparse.py", line 69, in insert_env_defaults
    return fn(self, **kwargs)
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/trainer/trainer.py", line 421, in __init__
    self._callback_connector.on_trainer_init(
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 79, in on_trainer_init
    _validate_callbacks_list(self.trainer.callbacks)
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 252, in _validate_callback
s_list
    stateful_callbacks = [cb for cb in callbacks if is_overridden("state_dict", instance=cb)]
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/trainer/connectors/callback_connector.py", line 252, in <listcomp>
    stateful_callbacks = [cb for cb in callbacks if is_overridden("state_dict", instance=cb)]
  File "/home/asivara/venv/lib/python3.8/site-packages/lightning/pytorch/utilities/model_helpers.py", line 34, in is_overridden
    raise ValueError("Expected a parent")
ValueError: Expected a parent

Based on this PyTorch thread, I think the imports in ray.tune.integration.pytorch_lightning need to be updated for Lightning 2.
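
For reference, this is roughly the kind of setup that triggers it (a stripped-down sketch, not my actual training script; the metric names are placeholders):

```python
# Sketch of the failing setup: TuneReportCallback still subclasses the legacy
# pytorch_lightning.Callback, while the Trainer comes from the new
# lightning.pytorch namespace, so Lightning's is_overridden() check fails.
from lightning.pytorch import Trainer
from ray.tune.integration.pytorch_lightning import TuneReportCallback

trainer = Trainer(
    max_epochs=1,
    callbacks=[TuneReportCallback({"loss": "val_loss"}, on="validation_end")],
)  # raises ValueError: Expected a parent
```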

@beasteers

(I also just hit this issue 😭)

@stale

stale bot commented Aug 12, 2023

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public Slack channel.

@stale stale bot added the stale label Aug 12, 2023
@stale

stale bot commented Oct 15, 2023

Hi again! The issue will be closed because there has been no further activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public Slack channel.

Thanks again for opening the issue!

@stale stale bot closed this as completed Oct 15, 2023
@championsnet

championsnet commented Apr 11, 2024

> Based on this PyTorch thread, I think the imports in ray.tune.integration.pytorch_lightning need to be updated for Lightning 2.

According to this thread in the Lightning repo, this error appears when you include callbacks that still import from the legacy pytorch_lightning namespace while the Trainer comes from lightning.pytorch.

So the fix, I guess, is to fork the current implementation and update the imports yourself, along the lines of the sketch below.
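
Roughly something like this (just a sketch rebased onto the new lightning.pytorch namespace; the class name and metric mapping are placeholders, not the shipped implementation):

```python
# Rough sketch of a forked report callback built on lightning.pytorch
# instead of the legacy pytorch_lightning package.
from lightning.pytorch.callbacks import Callback
from ray.air import session


class TuneReportCallback2(Callback):
    """Report selected trainer metrics to Ray Tune after each validation run."""

    def __init__(self, metrics: dict):
        # Maps the name reported to Tune -> the key in trainer.callback_metrics.
        self._metrics = metrics

    def on_validation_end(self, trainer, pl_module):
        # Don't report during the sanity-check validation pass.
        if trainer.sanity_checking:
            return
        report = {
            tune_key: trainer.callback_metrics[pl_key].item()
            for tune_key, pl_key in self._metrics.items()
            if pl_key in trainer.callback_metrics
        }
        session.report(report)
```

Then pass e.g. TuneReportCallback2({"loss": "val_loss"}) to the lightning.pytorch.Trainer in place of the shipped callback.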

UPDATE: Should be fixed in #44339
