Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #820

Open · kavinwkp opened this issue Nov 18, 2023 · 2 comments · May be fixed by #831
Labels: bug (Something isn't working)

@kavinwkp

Bug description

Traceback (most recent call last):
  File "/home/kavin/Documents/PycharmProjects/RL/Imitation/example.py", line 150, in <module>
    bc_trainer.train(n_epochs=1)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 495, in train
    training_metrics = self.loss_calculator(self.policy, obs_tensor, acts)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 130, in __call__
    (_, log_prob, entropy) = policy.evaluate_actions(
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/stable_baselines3/common/policies.py", line 736, in evaluate_actions
    log_prob = distribution.log_prob(actions)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/stable_baselines3/common/distributions.py", line 292, in log_prob
    return self.distribution.log_prob(actions)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/torch/distributions/categorical.py", line 127, in log_prob
    return log_pmf.gather(-1, value).squeeze(-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

Training does not seem to work on a GPU: I get this error when I set the device to "cuda:0", but not when I set it to "cpu".
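
The traceback boils down to a device mismatch inside torch.gather: the categorical distribution's log-probabilities live on the GPU while the demonstration actions are still CPU tensors. A minimal illustration of the same failure (assuming a CUDA device is available; the tensors here are made up):

import torch as th

log_pmf = th.log_softmax(th.zeros(4, 2, device="cuda:0"), dim=-1)  # policy output on the GPU
actions = th.zeros(4, 1, dtype=th.long)                            # actions left on the CPU
log_pmf.gather(-1, actions)  # raises the same "Expected all tensors to be on the same device" RuntimeError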

Steps to reproduce

print("Loading expert demonstrations...")
rng = np.random.default_rng(0)

env = gym.make("CartPole-v1")
venv = make_vec_env("CartPole-v1", post_wrappers=[lambda env, _: RolloutInfoWrapper(env)], rng=rng)
expertAgent = PPO.load("ppo_cartpole.zip", print_system_info=False)

print("Rollouts...")
rollouts = rollout.rollout(
    expertAgent,
    venv,
    rollout.make_sample_until(min_timesteps=None, min_episodes=60),
    rng=rng,
)

bc_trainer = bc.BC(
    observation_space=venv.observation_space,
    action_space=venv.action_space,
    demonstrations=rollouts,
    rng=rng,
    device="cuda:0",
)

reward, _ = evaluate_policy(
    bc_trainer.policy,
    venv,
    n_eval_episodes=3,
    render=False,
)
print(f"Reward before training: {reward}")

print("Training a policy using Behavior Cloning")
bc_trainer.train(n_epochs=1)

reward, _ = evaluate_policy(
    bc_trainer.policy,
    env,
    n_eval_episodes=3,
    render=False,
)
print(f"Reward after training: {reward}")

Environment

  • Operating system and version: Ubuntu 20.04
  • Python version: 3.8
  • Output of pip freeze --all:
    stable-baselines3==2.2.0
kavinwkp added the bug (Something isn't working) label on Nov 18, 2023
@Rajesh-Siraskar commented Dec 16, 2023

I get this error too. RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA_gather)

It appears on the call to train() in at least two algorithms: dagger_trainer.train() and bc_trainer.train().

imitation version 1.0

@tvietphu commented Apr 4, 2024

This error occurs because the function safe_to_tensor in imitation/src/imitation/util/util.py returns a tensor without transferring it to 'cuda'.

Fix for the safe_to_tensor function:
replace line 259: return array
with: return th.as_tensor(array, **kwargs)
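
A sketch of what the patched helper could look like (the read-only-array handling is an assumption about the rest of the existing function, not part of the comment above):

import numpy as np
import torch as th

def safe_to_tensor(array, **kwargs) -> th.Tensor:
    """Convert array to a tensor, forwarding kwargs such as device=... to torch.

    Proposed change: instead of returning an existing tensor unchanged
    (which leaves it on the CPU even when device="cuda:0" was requested),
    always go through th.as_tensor so the device kwarg is applied.
    """
    if isinstance(array, np.ndarray) and not array.flags.writeable:
        array = array.copy()  # copy non-writable arrays before wrapping them
    return th.as_tensor(array, **kwargs)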
